Induction promoter gene and secretory signal gene usable in Schizosaccharomyces pombe, expression vectors having the same, and use thereof

ABSTRACT

An isolated DNA in an invertase gene from Schizosaccharomyces pombe, which is located in a region involved in catabolite repression. The DNA may be incorporated into a cloning vector, particularly a vector containing a heterologous protein structural gene. The vector can be used to transform Schizosaccharomyces pombe. A heterologous protein may be produced by incubating the transformant and isolating the protein.

This application is a National Stage of International Application PCT/JP98/04929, filed Oct. 30, 1998.

TECHNICAL FIELD

The present invention relates to an inducible promoter gene and secretion signal gene for use in the fission yeast Schizosaccharomyces pombe (hereinafter referred to as S. pombe), an expression vector containing them and a process for producing a protein using them. In particular, it relates to a process for producing a desired protein wherein the S. pombe invertase promoter is used to make it possible to control the timing of the protein production by the presence or absence of a specific nutrient through regulated gene expression, and a process for secretory production of a desired protein by using the secretion signal gene for the S. pombe invertase precursor.

BACKGROUND ART

S. pombe, despite being a eukaryote, has been studied extensively for its high versatility in genetics, molecular biology and cellular biology as a unicellular organism (Nasim A. et al. eds., Molecular biology of the fission yeast, Academic Press, 1989). In its cultures, monosaccharides such as glucose and fructose are used as the main carbon sources. It is known that in a culture medium lacking these monosaccharides, expression of invertase, the enzyme that degrades sucrose into glucose and fructose, is induced to secure the carbon source necessary for its growth (Moreno S. et al., Arch Microbiol. 142, 370, 1985).

S. pombe invertase is a high-molecular weight glycoprotein located on the cell surface with a molecular weight of about 205000, 67% of which is attributed to sugar chains composed of equimolar amounts of mannose and galactose residues. Molecular weight and amino acid studies of the protein moiety of the pure enzyme and experiments using antibodies have shown high similarity between S. pombe invertase and the invertase from the budding yeast Saccharomyces cerevisiae from the viewpoint of protein chemistry (Moreno S. et al., Biochem. J. 267, 697, 1990). It is also known that a drop in glucose concentration de-represses synthesis of invertase (Mitchinson J. et al., Cell Sci 5, 373, 1969).

Induced invertase synthesis (de-repression) is also observed in Saccharomyces cerevisiae. Previous detailed studies on genetic regulation of invertase expression, the biosynthetic pathway and the structure of the sugar chain moiety have shown that Saccharomyces cerevisiae invertase is encoded by six overlapping genes, SUC1 to SUC5 and SUC7, on one chromosome and that activation of at least one of these SUC genes leads to utilization of sucrose and raffinose (Hohmann S. et al., Curr Genet 11, 217, 1986).

In contrast, with respect to S. pombe, although purification of the invertase protein has been reported (Moreno S. et al., 1985), no invertase genes had been identified until the present inventors and coworkers recently reported two overlapping invertase genes inv0⁺ and inv1⁺ in S. pombe. Because inv0⁺ is likely a pseudogene having an incomplete open reading frame, inv1⁺ is the only gene encoding S. pombe invertase, which is supposed to confer the ability to grow on sucrose even in the absence of other carbon sources ("Kobogaku" edited by Yositaka Hashitani, Iwanami Shoten, 1967).

Analysis of the promoter region of the isolated gene suggested that a specific sequence between the 1st and 620th base pairs is involved in catabolite repression.

In Saccharomyces cerevisiae, the SUC2 gene is transcribed into two messenger RNAs (mRNAs) from different transcription initiation sites. The shorter one is a constitutive mRNA encoding the intracellular invertase, while the longer one is an mRNA encoding the catabolite-repressible secretory invertase with a de-repression ratio of not less than 200 (Carlson M. et al., Mol. Cell. Biol. 3, 439, 1983). Analysis of the promoter region for the longer mRNA suggested that the transcription initiation factor binds to a specific repeated sequence between positions -650 and -418 (Salokin L et al., Mol. Cell. Biol. 6, 2314, 1986). The region between positions -418 and -140 has been shown to be necessary for glucose repression.

These regions in the SUC2 gene showed no significant homology with the inv1⁺ upstream region between positions 1 and 2809. However, multiple copies of a so-called 7-bp motif with the sequence (A/C)(A/G)GAAAT, which is repeated at five sites in the region indispensable for glucose derepression, have been found in the inv1⁺ upstream region. Further, while palindrome stem-loops have been identified at almost the same positions in the upstream regions of glucose-repressible genes (SUC, MAL and GAL), palindrome sequences have also been found in the upstream region of the inv1⁺ gene from S. pombe. These sequences are anticipated to play an important role in glucose repression in S. pombe.
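As an illustration only, a search for the 7-bp motif (A/C)(A/G)GAAAT on both strands of a sequence might be sketched as follows. The function names and the toy sequence are hypothetical and are not taken from the Sequence Listing; overlapping sites are reported through a look-ahead pattern.

```python
import re

# The 7-bp motif (A/C)(A/G)GAAAT written as a regular expression.
# The look-ahead lets overlapping occurrences be reported as well.
MOTIF = re.compile(r"(?=([AC][AG]GAAAT))")
MOTIF_LEN = 7

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motif(seq: str):
    """Return sorted (1-based start, strand, matched text) tuples.

    For '-' strand hits the start is the leftmost base of the site
    expressed in forward-strand coordinates.
    """
    seq = seq.upper()
    n = len(seq)
    hits = [(m.start() + 1, "+", m.group(1)) for m in MOTIF.finditer(seq)]
    for m in MOTIF.finditer(reverse_complement(seq)):
        start_fwd = n - (m.start() + MOTIF_LEN) + 1
        hits.append((start_fwd, "-", m.group(1)))
    return sorted(hits)


# Toy sequence (hypothetical, NOT part of SEQ ID NO: 1).
toy = "TTTAGGAAATCCCATTTCTGGG"
for start, strand, text in find_motif(toy):
    print(start, strand, text)
```

Applied to the actual upstream region, such a scan would only list candidate sites; whether a given site is functional is a matter for the deletion analysis described later.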

The yeast S. pombe is phylogenetically different from Saccharomyces cerevisiae. It is quite different from other yeasts in the chromosome structure and various mechanisms for genome replication, RNA splicing, transcription and posttranslational modification, and rather resembles animal cells in some of these aspects. For this reason, S. pombe is widely used as a eukaryotic model (Giga-Hama and Kumagai, eds., Foreign gene expression in fission yeast Schizosaccharomyces pombe, Springer-Verlag, 1997).

S. pombe is also widely used as a host for expression of heterologous protein genes and known to be suited especially for expression of genes from animals including human (JP-A-5-15380 and JP-A-7-163373). For its advanced membrane structures including the Golgi body and the endoplasmic reticulum, S. pombe is also used for expression of membrane proteins and shows high level expression. For S. pombe, constitutive expression vectors (pEVP11, pART1 and pTL2M) and an inducible expression vector using the promoter region of the nmt1⁺ gene (pREP1) are usually used as expression vectors. No S. pombe expression vectors of the GAL type or the SUC type have been known though these types of vectors are commonly used for Saccharomyces cerevisiae.

The expression of the SUC2 gene from Saccharomyces cerevisiae in S. pombe has been shown to be constitutive, not catabolite repressible, though the expression product contains galactose residues conferred by the host (Zarate, V. et al., J Applied Bacteriology, 80, 45, 1996), suggesting differences between S. pombe and Saccharomyces cerevisiae in the mechanism for catabolite repression of invertase. The differences are of great significance because the promoter from Saccharomyces cerevisiae usually used by those skilled in the art for construction of inducible expression vectors of the invertase type (the SUC2 type) is not applicable to S. pombe vectors. Therefore, development of S. pombe vectors of this type has been long delayed.

On the other hand, the present inventors constructed an expression vector using the secretion signal gene encoding the secretion signal in the precursor of a S. pombe mating pheromone (WO96/23890). However, this secretion signal gene is not an all-purpose secretion signal gene, and other secretion signal genes that function in S. pombe are desired for production of some types of protein.

DISCLOSURE OF THE INVENTION

As a result of their extensive research with a view to solving the above problems, the present inventors have accomplished the present invention by preparing a new clone of the S. pombe invertase gene and constructing an inducible expression vector. They have also found that the N-terminal 22 amino acid sequence in the amino acid sequence of the invertase precursor functions as a secretion signal. On the basis of these findings, they have constructed an expression vector using the secretion signal gene and established secretory production of desired proteins.

The present invention relates to a region in the invertase gene from S. pombe, which is involved in catabolite repression, an inducible expression vector using the region and a system using it for heterologous gene expression and provides:

a DNA in an invertase gene from Schizosaccharomyces pombe, which is located in a region involved in catabolite repression,

a DNA having the base sequence of bases 1 to 2809 in SEQ ID NO: 1 in the Sequence Listing,

a recombinant vector containing the sequence of the DNA,

a multicloning vector containing the sequence of the DNA and a multicloning site,

a multicloning vector having the structure shown in FIG. 9,

an expression vector for transformation of Schizosaccharomyces pombe containing the sequence of the DNA and a heterologous protein structural gene,

a transformant from Schizosaccharomyces pombe containing the expression vector, and

a process for producing a protein which comprises incubating the transformant and recovering an expressed heterologous protein.

Firstly, the present inventors cloned and sequenced a S. pombe invertase gene, which had not been genetically identified. Then, they demonstrated by gene disruption analysis that the invertase gene is responsible for the overall invertase activity. Further, they identified the region involved in catabolite repression and constructed an inducible expression vector using the region. They actually constructed a recombinant vector carrying the gene of a green fluorescent protein, transformed S. pombe with the vector and confirmed the expression of the protein by activity (fluorescence) analysis and immunological analysis. They also demonstrated repression of the heterologous gene expression in the presence of glucose in the culture medium and derepression by exhaustion of glucose.

The present inventors used the following procedure to identify and characterize the gene of the S. pombe invertase precursor:

(1) PCR using a cDNA library from S. pombe as a template and primers based on conserved amino acid sequences in invertase genes from many other organisms;

(2) screening of a genomic library from S. pombe by plaque hybridization using the PCR product as a probe for positive clones;

(3) confirmation of the positive clones by restriction digestion followed by electrophoresis;

(4) Southern hybridization analysis and total sequencing of a fragment with a specific length in the positive clones;

(5) gene disruption analysis of invertase activity;

(6) investigation of the optimum pH for expression of the invertase gene from S. pombe and the effects of the glucose concentration on glucose repression and derepression; and

(7) identification of a region indispensable for glucose repression through subcloning of the related upstream region.

Also, the present inventors constructed a S. pombe invertase inducible expression vector by the following procedure and actually demonstrated inducible expression of a green fluorescent protein:

(1) construction of an inducible multicloning expression vector pRI0M containing an invertase promoter by modifying a S. pombe multicloning vector, pTL2M (JP-A-7-163373);

(2) construction of an inducible expression vector pRI0EGFP for expression of a green fluorescent protein variant from the inducible multicloning vector, pRI0M;

(3) transformation of a wild-type S. pombe strain with the inducible expression vector, pRI0EGFP, for expression of the green fluorescent protein variant;

(4) demonstration of the expression of the green fluorescent protein variant by activity (fluorescence) analysis and SDS-PAGE-western blotting; and

(5) establishment of the conditions for the inducible expression on the basis of the dependence of the expression level on the glucose concentration in the culture medium.

SEQ ID NO: 1 in the Sequence Listing is the base sequence of the gene of the invertase precursor, which contains a region involved in catabolite repression. The region involved in catabolite repression is the DNA sequence between positions 1 and 2809 of SEQ ID NO: 1 or within the DNA sequence. In the DNA sequence between positions 1 and 2809 of SEQ ID NO: 1, the region extending from position 1 to position 620 of SEQ ID NO: 1 and the region extending from position 1610 to position 2610 of SEQ ID NO: 1 are especially important, as is evident from the results of the analysis in Example 6 shown in FIG. 8 (position 2810 in SEQ ID NO: 1 corresponds to position 1 in FIG. 8). This means that the inducible promoter in the present invention is not restricted to a DNA having the base sequence from position 1 to position 2809 of SEQ ID NO: 1 so long as it contains these genes involved in catabolite repression and functions as an inducible promoter. Still, a DNA having a base sequence from position 1 to position 2809 of SEQ ID NO: 1 is preferable as an inducible promoter because it actually functions in S. pombe.
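The coordinates quoted above are 1-based and inclusive, which differs from the 0-based, half-open slices of most programming languages. A minimal sketch of the conversion is given below. The helper names are hypothetical, the placeholder string stands in for SEQ ID NO: 1 and is not the real sequence, and the numbering of upstream positions (2809 mapped to -1, with no position 0) is an assumption; the text states only that position 2810 corresponds to position 1 in FIG. 8.

```python
# Coordinates taken from the description (1-based, inclusive).
PROMOTER = (1, 2809)
REPRESSION_REGION = (1, 620)
DEREPRESSION_REGION = (1610, 2610)
FIG8_ORIGIN = 2810  # SEQ ID NO: 1 position that FIG. 8 numbers as 1


def region(seq: str, start: int, end: int) -> str:
    """Slice seq with 1-based inclusive coordinates."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("coordinates out of range")
    return seq[start - 1:end]


def to_fig8_position(seq_pos: int) -> int:
    """Map a SEQ ID NO: 1 position to FIG. 8 numbering.

    2810 -> 1 is stated in the text. Mapping 2809 -> -1 (skipping 0)
    is an ASSUMPTION based on common promoter numbering.
    """
    offset = seq_pos - FIG8_ORIGIN
    return offset + 1 if offset >= 0 else offset


# Placeholder of 3000 nt; NOT the real SEQ ID NO: 1.
placeholder = "ACGT" * 750
print(len(region(placeholder, *PROMOTER)))             # length of promoter span
print(len(region(placeholder, *DEREPRESSION_REGION)))  # length of 1610..2610
print(to_fig8_position(2810), to_fig8_position(2809))
```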

The above-mentioned DNA which contains genes involved in catabolite repression and functions as an inducible promoter, preferably having the base sequence from position 1 to position 2809 of SEQ ID NO: 1, is hereinafter referred to as the inducible promoter gene. The inducible promoter gene can be integrated with a vector for construction of recombinant vectors such as multicloning vectors and expression vectors. A multicloning vector is a vector having a multicloning site and provides an expression vector through introduction of a desired structural gene into the multicloning site. An expression vector is a vector containing a structural gene and used for expression of a structural gene encoding a heterologous protein. A "heterologous" protein is a protein which is not inherent in the host. For example, when the host is S. pombe, a heterologous protein is a protein which is not inherent in S. pombe (such as a human protein).

In the expression vector, the inducible promoter gene is located upstream from the heterologous protein structural gene and regulates expression of the structural gene. The inducible promoter gene in the expression vector regulates the expression of the heterologous protein structural gene downstream, just as the inducible promoter gene located upstream in the base sequence represented by SEQ ID NO: 1 regulates the expression of the structural gene of the invertase precursor. In the multicloning vector, the inducible promoter gene is located upstream from the multicloning site into which a heterologous protein structural gene is to be introduced.

One example of the multicloning vector of the present invention is the multicloning vector pRI0M constructed in Example 7, which has the structure shown in FIG. 9. The entire base sequence of pRI0M is SEQ ID NO: 3. Inv1-P is the above-mentioned inducible promoter gene, and MCS is the multicloning site. One example of the expression vector of the present invention is the inducible expression vector pRI0EGFP for expression of a green fluorescent protein variant, which was constructed in the Examples by introducing the structural gene (EGFP-ORF) of a green fluorescent protein variant and has the structure shown in FIG. 10. The entire base sequence of the expression vector pRI0EGFP is SEQ ID NO: 14.

The most suitable cell (host) to transform with the expression vector of the present invention is S. pombe because the inducible promoter gene in the present invention is an inducible promoter gene from S. pombe.

Under catabolite repressing conditions (for example, in a culture medium containing a high level of glucose), S. pombe transformed with the expression vector of the present invention grows with no (or low) expression of the heterologous protein. Growth at this stage without the burden of heterologous protein expression is more efficient than growth under the burden. Subsequent incubation under catabolite derepressing conditions (for example, in a culture medium containing no or a low level of glucose) induces the increased number of S. pombe cells to express the heterologous protein at a high level, though growth of S. pombe is less efficient than under catabolite repressing conditions. Thus, controlled transition between growth of S. pombe and heterologous protein expression through catabolite repression allows more efficient production of a heterologous protein.

Catabolite repression can be controlled not only in an active way as described above but also in a passive way. For example, when a S. pombe transformant is incubated in a culture medium containing a given amount of glucose, the S. pombe grows under catabolite repressing conditions containing a high level of glucose in the initial stage, but later on production of a heterologous protein predominates due to catabolite derepression as glucose is exhausted. In this way, more efficient heterologous protein production than ever is possible without active control of the glucose level.

In addition to the above-mentioned total sequencing of the invertase precursor gene from S. pombe, the present inventors determined the complete amino acid sequence of the invertase precursor (the amino acid sequence in SEQ ID NO: 2). Then, they found that the first 22 amino acid peptide in the amino acid sequence of the invertase precursor (Met Phe Leu Lys Tyr Ile Leu Ala Ser Gly Ile Cys Leu Val Ser Leu Leu Ser Ser Thr Asn Ala) (amino acids 1-22 of SEQ ID NO: 2) acts as a secretion signal. Hereinafter, the peptide is referred to as the secretion signal.

It is expected that a desired heterologous protein produced in transformed S. pombe cells as a protein fusion having the secretion signal at the N-terminal is secreted from the cells after intracellular processing which splits the protein fusion into the secretion signal and the heterologous protein. The present inventors constructed an expression vector carrying a heterologous protein structural gene (specifically, human interleukin 6-a'c1 variant) fused with a DNA encoding the secretion signal (namely, a structural gene of a protein fusion as mentioned above) and demonstrated secretion of the heterologous protein from S. pombe cells transformed with the expression vector.

The present invention provides the secretion signal, a DNA encoding the secretion signal (hereinafter referred to as a secretion signal gene), a recombinant vector carrying the secretion signal gene, a multicloning vector carrying the secretion signal gene, an expression vector carrying the secretion signal gene and a heterologous protein structural gene for transformation of S. pombe, a S. pombe transformant carrying the expression vector and a process for producing a protein which comprises incubating the transformant and recovering the expressed heterologous protein.

The secretion signal gene is not restricted to the 66-bp sequence extending from position 2810 to position 2875 in SEQ ID NO: 1 and may be a DNA having a different base sequence encoding the amino acid sequence of the secretion signal. In the expression vector, the secretion signal gene and the heterologous protein structural gene are preferably linked directly. But they may be linked via another DNA sequence, for example, one extending from position 2876 in SEQ ID NO: 1. In this case, the protein product has extra amino acid residues at the N-terminal of the heterologous protein but can be converted into the desired heterologous protein by trimming off the N-terminal extra amino acid residues. However, the disadvantage from the presence of these extra amino acid residues usually becomes more serious for the desired protein as the number of extra amino acid residues increases. Therefore, as the intervening DNA between the secretion signal gene and the heterologous protein structural gene, a short DNA encoding at most 10 amino acid residues is preferable. Particularly preferably, the secretion signal gene and the heterologous protein structural gene are linked directly.
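The arithmetic behind this paragraph can be checked with a short sketch: the 22-residue secretion signal corresponds to 66 bases (positions 2810 to 2875), any DNA encoding the same 22 residues would serve, and an intervening linker should encode at most 10 residues. The codon table below is a hypothetical choice of one standard-code codon per residue and is not the native inv1⁺ sequence; the helper names are likewise illustrative.

```python
# Secretion signal, amino acids 1-22 of SEQ ID NO: 2 (one-letter code).
SIGNAL_PEPTIDE = "MFLKYILASGICLVSLLSSTNA"
SIGNAL_GENE_SPAN = (2810, 2875)  # positions in SEQ ID NO: 1

# HYPOTHETICAL codon choice (standard genetic code, one codon per residue).
# This is NOT the native codon usage of the inv1+ gene.
CODON_CHOICE = {
    "M": "ATG", "F": "TTT", "L": "TTG", "K": "AAG", "Y": "TAC",
    "I": "ATT", "A": "GCT", "S": "TCT", "G": "GGT", "C": "TGT",
    "V": "GTT", "T": "ACT", "N": "AAC",
}


def span_length(start: int, end: int) -> int:
    """Length of a 1-based inclusive span."""
    return end - start + 1


def back_translate(peptide: str) -> str:
    """Return one of the many possible DNAs encoding the peptide."""
    return "".join(CODON_CHOICE[aa] for aa in peptide)


def linker_ok(linker_dna: str, max_residues: int = 10) -> bool:
    """True if the linker is in frame and encodes at most max_residues."""
    return len(linker_dna) % 3 == 0 and len(linker_dna) // 3 <= max_residues


alt_gene = back_translate(SIGNAL_PEPTIDE)
print(len(SIGNAL_PEPTIDE), span_length(*SIGNAL_GENE_SPAN), len(alt_gene))
print(linker_ok(""), linker_ok("GCT" * 10), linker_ok("GCT" * 11))
```

An empty linker corresponds to the directly linked arrangement that the text describes as particularly preferable.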

Construction of an expression vector carrying the secretion signal gene using a multicloning vector can be attained by inserting a heterologous protein structural gene fused with the secretion signal gene into the multicloning site of a multicloning vector or by inserting a heterologous protein structural gene into the multicloning site of a multicloning vector carrying the secretion signal gene. The latter method tends to restrict the structure of the multicloning site because the secretion signal gene is preferred to be located immediately in front of the multicloning site as described above. Therefore, the former method is preferred for construction of an expression vector. As a S. pombe multicloning vector, for example, pTL2M, which is disclosed in JP-A-7-163373, is preferable.

According to the present invention, an expression vector can be constructed by using both the inducible promoter gene and the secretion signal gene. For example, an expression vector which contains the DNA sequence of from position 1 to position 2875 in SEQ ID NO: 1 and a heterologous protein structural gene introduced downstream of the DNA sequence can be constructed. Such an expression vector enables catabolite repressible secretory production of a heterologous protein by the host cell. A similar expression vector can be constructed by using a known secretion signal gene (such as the secretion signal gene disclosed in WO96/23890) instead of the above-mentioned secretion signal gene.

BRIEF DESCRIPTION OF DRAWINGS

The following drawings are presented in connection with the section of Best Mode for Carrying Out the Invention.

FIG. 1 shows a comparison of (partial) amino acid sequences deduced from inv1⁺ (amino acids 58-393 of SEQ ID NO: 2), the Schwanniomyces occidentalis invertase gene (SEQ ID NO: 23) and the Saccharomyces cerevisiae SUC2 gene (SEQ ID NO: 24).

FIG. 2 is the restriction map of the inv1⁺ gene.

FIG. 3 electrophoretically shows disruption of the inv1⁺ gene.

FIGS. 4(a)-4(b). FIG. 4(a) is a photograph of colony gel overlay assay of invertase activity (for phenotype characterization) as a substitute for a drawing. FIG. 4(b) is a schematic explanation of experimental design of the invertase activity assay shown in FIG. 4(a).

FIGS. 5(a)-5(b). FIG. 5(a) is a photograph of colony gel overlay assay of invertase activity (for phenotype characterization) as a substitute for a drawing. FIG. 5(b) is a schematic explanation of experimental design of the invertase activity assay shown in FIG. 5(a).

FIG. 6 graphically shows the relation between the invertase activity and the glucose concentration.

FIG. 7 graphically shows the relation between the invertase activity and the glucose concentration.

FIG. 8 shows the results of analysis of invertase promoters.

FIG. 9 shows the structure of an inducible expression vector pRI0M.

FIG. 10 shows the structure of an inducible expression vector pRI0EGFP for expression of a green fluorescent protein.

FIGS. 11(a) and 11(b) demonstrate the expression of a green fluorescent protein.

FIGS. 12(a) and 12(b) graphically show the relation between the incubation time and the expression level of the green fluorescent protein.

FIGS. 13(a) and 13(b) graphically show the relation between the incubation time and the expression level of the green fluorescent protein.

FIGS. 14(a) and 14(b) show the relation between the incubation time and the expression level of the green fluorescent protein.

FIG. 15 is a SDS-PAGE pattern obtained in analysis of the expression of interleukin 6a'c1 variant.

FIG. 16 is the western blot pattern of the expressed interleukin variant.

BEST MODE FOR CARRYING OUT THE INVENTION

Now, the present invention will be described in further detail with reference to specific Examples.

EXAMPLE 1 Isolation of S. pombe Invertase Gene

PCR using a S. pombe cDNA library as the template and primers designed on the basis of conserved sequences in invertase genes from other organisms, shown in SEQ ID NOS: 4 to 6, gave amplification products of about 300 bp and about 400 bp. Each PCR product was purified by using EASY TRAP (Takara Shuzo Co., Ltd.) and sequenced after ligation into a vector by using pMOS Blue T vector kit (Amersham Pharmacia Biotech K.K.). The deduced amino acid sequence indicated that part of the 400-bp PCR product has significant homology with the SUC2 gene from Saccharomyces cerevisiae.

Screening of a genomic library of S. pombe for the entire invertase gene by plaque hybridization using the 400-bp PCR product as a probe picked up 15 positive clones from about 8,000 plaques. Secondary screening of the positive clones by plaque hybridization left four positive clones. Treatment of small amounts of phage DNA extracts from the positive clones with various restriction enzymes gave identical cleavage patterns, and thus revealed that all the clones were identical.

The entire invertase gene was isolated by the following two-step procedure. A 3.0-kb HindIII fragment was isolated from the hybrid-forming clones and ligated into a plasmid pBluescript II SK- (Toyobo Co., Ltd.). Restriction mapping identified BamHI and SalI sites. Subcloning of the fragment using these restriction enzymes and subsequent sequencing using a deletion technique revealed that the HindIII fragment contains the complete ORF of the gene but contains only about 200 bp within the upstream region, which is supposedly involved in the gene expression. Therefore, separately, a 3.5-kb BamHI fragment from the hybrid-forming clones was further digested with HindIII to give a 2.6-kb fragment. Subcloning of the 2.6-kb fragment in plasmid pBluescript II SK-, and subsequent sequencing using a deletion technique revealed that the BamHI-HindIII fragment contained a sequence within the upstream region supposedly involved in the gene expression. The resulting complete 5.6-kb gene was designated as inv1⁺. The base sequence of inv1⁺ and the amino acid sequence encoded by its ORF are shown in SEQ ID NOS: 1 and 2. A plasmid carrying the complete gene was designated as pINV3000.

These results suggest that the inv1⁺ product has 16 asparagine-linked glycosylation sites. FIG. 1 shows a comparison of the amino acid sequence deduced from the base sequence of the inv1⁺ gene from S. pombe (SEQ ID NO: 2), the amino acid sequence of Schwanniomyces occidentalis (SEQ ID NO: 23) invertase and the amino acid sequence deduced from the base sequence of the SUC2 gene from Saccharomyces cerevisiae (SEQ ID NO: 24). Amino acids that are common to the three are marked with *. FIG. 1 clearly shows that the amino acid sequence deduced from the inv1⁺ base sequence has significant homology with invertases from other origins such as Schwanniomyces occidentalis and Saccharomyces cerevisiae, which suggests that inv1⁺ encodes invertase.

EXAMPLE 2 Disruption of the inv1⁺ Gene

The HindIII site in plasmid pBluescript II SK- having a S. pombe ura4⁺ gene insert at the ClaI site was disrupted by HindIII digestion followed by blunting and self-ligation (self-cyclization). Double restriction digestion of the plasmid with XbaI and HincII gave a ura4⁺ fragment. The plasmid pBluescript was integrated with the ura4⁺ fragment after restriction digestion with SpeI and subsequent blunting and XbaI digestion, to provide a plasmid having BamHI sites on both sides of ura4⁺. The BamHI site in plasmid pBluescript II SK- was disrupted similarly by restriction digestion followed by blunting and self-ligation, and the 3.0-kb fragment containing the inv1⁺ ORF was inserted at the HindIII site. BamHI digestion of the plasmid eliminated a 1.4-kb fragment (containing part of the inv1⁺ ORF encoding the C-terminal of invertase) from the 3.0-kb insert, and a ura4⁺ cassette having BamHI sites at both ends was inserted. HindIII digestion of the resulting plasmid gave a DNA fragment having inv1⁺ neighboring regions at both ends (FIG. 2). The restriction map of the inv1⁺ gene is shown in FIG. 2, wherein the open reading frame (ORF) is indicated by the arrow (inv1⁺ ORF) and the ura4⁺ replacement from Schizosaccharomyces pombe is boxed (ura4⁺). The disruption mutant strain had an inv1⁺ fragment carrying the S. pombe ura4⁺ gene instead of the 1.4-kb inv1⁺ BamHI-BamHI fragment partly containing the ORF. The inv1⁺ fragment was used to transform a wild-type S. pombe strain, TP4-1D [h⁻, leu1, ura4, ade6-M216, his2, obtained from Dr. Takashi Toda (Imperial Cancer Research Foundation)], and viable colonies on a uracil-free culture medium were collected. Overlay assay of invertase activity revealed that 7 out of 28 strains, namely 25% of the ura4⁺ colonies, lacked invertase activity.

Further, to verify the chromosomal inv1⁺ gene disruption, genomic DNA from a strain lacking invertase activity was analyzed after double restriction digestion with HindIII and SalI by Southern hybridization using the inv1⁺ HindIII-SalI fragment (2 kb) as the probe. The 3-kb hybridized fragment, which was not digested with SalI, shown in FIG. 3 demonstrates that part of the inv1⁺ gene in the chromosomal DNA had been replaced with the ura4⁺ gene in the strain which lacked invertase activity.

Thus, the inv1⁺ gene proved to be the only invertase gene expressed in S. pombe.

EXAMPLE 3 Restoration of Invertase Activity by the inv1⁺ Gene

The 3.0-kb HindIII fragment containing the entire inv1⁺ ORF from the invertase gene was inserted into S. pombe vector pAU-SK (obtained from Dr. Chikashi Shimoda, Department of Science, Osaka City University), and the resulting recombinant vector was used to transform the invertase-defective strains (Example 2). The resulting transformants were streaked on YP sucrose plates (supplemented with 10 μg/ml antimycin A and 20 μg/ml bromocresol purple) for overlay assay of invertase activity. Further, the 2.6-kb inv1⁺ BamHI-HindIII fragment containing the upstream promoter region and the 2.0-kb HindIII-SalI fragment from the invertase gene containing the ORF were ligated, and the resulting 4.6-kb BamHI-SalI fragment was inserted into pAU-SK. Transformation of the resulting recombinant vector into the invertase-defective strain was followed by similar overlay assay of invertase activity.

Both transformants (inv1Δ[pAU-SK::inv1⁺]) and the wild-type strain TP4-1D (WT) developed blue stains, which indicate invertase production, unlike the inv1⁺ disruption mutant strain (inv1Δ). The addition of the upstream promoter region resulted in stronger stains, which suggest high-level invertase expression (FIGS. 4(a) and 4(b)). FIG. 4(a) is a photograph showing the results of the gel overlay assays, and FIG. 4(b) is a schematic explanation of the stained sections shown in FIG. 4(a).

The invertase-defective strain was hardly viable on YP sucrose plates (supplemented with 10 μg/ml antimycin) whereas the wild-type strain and the transformants were recognizably viable after 5 days of incubation at 30° C. (FIGS. 5(a) and 5(b)). FIG. 5(a) is a photograph showing the results of characterization by colony formation, and FIG. 5(b) schematically explains the characterization shown in FIG. 5(a).

These results demonstrate that the inv1⁺ gene expression product is the invertase located on the cell surface.

EXAMPLE 4 Determination of Glucose Concentration for Gene Repression

For determination of the critical glucose concentration for catabolite repression of the invertase gene, the wild-type strain TP4-1D was incubated at 30° C. in 5 ml of MM medium containing 2%, 4%, 8% and 16% glucose with shaking to the mid-logarithmic growth phase. Invertase assays were done by the method of Goldstein et al., and the post-incubational glucose concentrations in the medium were determined by the phenol-sulfate method (FIG. 6). The hatched bars indicate invertase activity per cell (U/OD), and the empty bars indicate the residual glucose concentration. Judging from the graph, a glucose concentration of 8% is the optimum for glucose repressing incubation, because when the glucose concentration was 8%, the invertase activity was sufficiently repressed with little decrease of glucose.

EXAMPLE 5 Determination of Glucose Concentration for Induced Invertase Production

For determination of the most effective glucose concentration for induced invertase production, the wild-type strain TP4-1D and a transformant [obtained by transforming the invertase-defective strain (Example 3) with a pAU-SK vector carrying the inv1⁺ BamHI-SalI fragment] were preincubated in a medium containing 2% glucose to the mid-logarithmic growth phase and incubated in an MM medium containing 0%, 0.01%, 0.05%, 0.1%, 0.25%, 0.5%, 1.0% and 2% glucose with shaking at 27° C. for 3 hours, and the invertase activity was assayed (FIG. 7). Each run of assay was carried out at 30° C. over 180 minutes. 1 U of invertase converts 1 nmol of sucrose into glucose per 1 minute at 30° C., pH 4.0.
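The unit definition above lends itself to a small worked calculation. The sketch below assumes that the amount of sucrose converted (in nmol) and the optical density of the culture have already been measured; the function names and the example numbers are hypothetical and are not data from FIG. 6 or FIG. 7.

```python
def invertase_units(nmol_sucrose_converted: float, minutes: float) -> float:
    """Units of invertase: 1 U converts 1 nmol of sucrose per minute."""
    if minutes <= 0:
        raise ValueError("assay time must be positive")
    return nmol_sucrose_converted / minutes


def activity_per_cell(units: float, optical_density: float) -> float:
    """Activity normalized to cell density (U/OD)."""
    if optical_density <= 0:
        raise ValueError("optical density must be positive")
    return units / optical_density


def derepression_ratio(derepressed: float, repressed: float) -> float:
    """Fold increase of activity under derepressing conditions."""
    return derepressed / repressed


# Hypothetical numbers for illustration only.
u = invertase_units(nmol_sucrose_converted=1800.0, minutes=180.0)
print(u)                                 # units in the assay
print(activity_per_cell(u, 2.0))         # U/OD
print(derepression_ratio(200.0, 5.0))    # fold derepression
```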

The optimum glucose concentration for induction was found to be 0.1% for the wild-type strain and 0.05% for the transformant. The invertase activity in the wild-type strain was 40 times higher under derepressing conditions than under repressing conditions. These results demonstrate catabolite repression in S. pombe.
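
By way of illustration only, the unit definition of Example 5 (1 U converts 1 nmol of sucrose per minute) and the ratio of derepressed to repressed activity may be sketched in Python as follows. The function names and all numerical readings are hypothetical and are chosen solely to show the arithmetic; they are not data from the examples.

```python
def invertase_units(sucrose_converted_nmol: float, minutes: float) -> float:
    """Return activity in U, where 1 U converts 1 nmol of sucrose per minute."""
    if minutes <= 0:
        raise ValueError("assay time must be positive")
    return sucrose_converted_nmol / minutes


def activity_per_cell(units: float, od: float) -> float:
    """Normalize activity to cell density (U/OD)."""
    if od <= 0:
        raise ValueError("cell density must be positive")
    return units / od


def fold_derepression(derepressed: float, repressed: float) -> float:
    """Ratio of activity under derepressing conditions to repressing conditions."""
    if repressed <= 0:
        raise ValueError("repressed activity must be positive")
    return derepressed / repressed


# Hypothetical readings, for illustration only (not data from the examples).
units = invertase_units(sucrose_converted_nmol=3600.0, minutes=180.0)
repressed = activity_per_cell(units=2.0, od=1.0)
derepressed = activity_per_cell(units=80.0, od=1.0)
print(units, fold_derepression(derepressed, repressed))
```

With these assumed readings the 180-minute assay corresponds to 20 U, and the ratio of the two normalized activities is 40.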

EXAMPLE 6 Analysis of the inv1⁺ Promoter Region

Fragments obtained by ligating S. pombe inv1⁺ upstream sequences extending from positions 1, 620, 1100, 1610 and 2610 of SEQ ID NO: 1, respectively, to the inv1⁺ ORF were inserted into expression vector pAU-SK to obtain 5 plasmids for deletion studies. The plasmids containing the upstream sequences extending from positions 1, 1610 and 2610 carry the inv1⁺ BamHI-SalI, SacI-SalI and HindIII-SalI fragments, respectively. The plasmids containing the inv1⁺ upstream sequences extending from positions 620 and 1100 were constructed by site-specific introduction of a SpeI site into pAU-SK::inv1⁺ (BamHI-SalI) using primers shown in SEQ ID NOS: 7 and 8, respectively, followed by partial removal of the upstream region by SpeI treatment.

The plasmids thus obtained were used to transform the invertase-defective strain (Example 3). Invertase assays were done to determine the enzyme activity in each transformant (FIG. 8). The results suggest that the sequence between position 1 and position 620 of SEQ ID NO: 1 is essential for glucose repression. The region between position 1620 and position 2610 of SEQ ID NO: 1 was identified as essential for high-level glucose derepression of invertase.
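
The layout of the five deletion constructs of Example 6 can be pictured with the following minimal Python sketch. It assumes only the upstream start positions given in Example 6 and the ORF start (position 2810) given in the feature table of SEQ ID NO: 1; the function names are hypothetical, and the sketch does not model the invertase assay itself.

```python
CDS_START = 2810  # first base of the inv1+ ORF, per <222> LOCATION in SEQ ID NO: 1
CONSTRUCT_STARTS = [1, 620, 1100, 1610, 2610]  # upstream start positions in Example 6


def upstream_length(start: int, cds_start: int = CDS_START) -> int:
    """Number of upstream bases retained between `start` and the ORF."""
    return cds_start - start


def deleted_intervals(starts):
    """Inclusive interval newly removed at each successive deletion step."""
    ordered = sorted(starts)
    return [(a, b - 1) for a, b in zip(ordered, ordered[1:])]


for start in CONSTRUCT_STARTS:
    print(f"construct from position {start}: {upstream_length(start)} bp upstream")
print(deleted_intervals(CONSTRUCT_STARTS))
```

Comparing the activity of each construct with that of the next longer one attributes any change in regulation to the interval removed at that step.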

EXAMPLE 7 Construction of Inducible Multicloning Expression Vector pRI0M Carrying the Invertase Promoter

PCR amplification using the plasmid pINV3000 (Example 1) carrying the invertase gene from S. pombe as the template and oligo DNAs shown in SEQ ID NOS: 9 and 10 as the primers was performed to give a sequence which contains the promoter region for the invertase gene and has restriction enzyme recognition sequences at both ends. After terminal double restriction digestion with SpeI (Takara Shuzo Co., Ltd.) and EcoRI (Takara Shuzo Co., Ltd.), the sequence was subjected to agarose gel electrophoresis. Purification of the band of about 3000 bp by the glass beads method using EASY-TRAP gave an insert fragment.

The S. pombe multicloning vector pTL2M (JP-A-7-163373) was subjected to agarose gel electrophoresis after terminal double restriction digestion with SpeI and EcoRI. Purification of the band of about 4500 bp by the glass beads method using EASY-TRAP gave a vector fragment.

The two fragments were ligated with a DNA ligation kit (Takara Shuzo Co., Ltd.) and transformed into E. coli strain DH5 (Toyobo Co., Ltd.). E. coli colonies were screened for the inducible expression vector pRI0M shown in FIG. 9 and SEQ ID NO: 3 in the Sequence Listing through base sequencing and restriction mapping, and the inducible expression vector was recovered by the alkali-SDS method on a preparative scale.

EXAMPLE 8 Construction of Inducible Expression Vector pRI0EGFP for Expression of Green Fluorescent Protein

PCR amplification using the plasmid pINV3000 (Example 1) carrying the invertase gene from S. pombe as the template and oligo DNAs shown in SEQ ID NOS: 9 and 11 as the primers was performed to give a sequence which contains the promoter region for the invertase gene and has restriction enzyme recognition sequences at both ends. After terminal double restriction digestion with SpeI and NheI (Takara Shuzo Co., Ltd.), the sequence was subjected to agarose gel electrophoresis. Purification of the band of about 3000 bp by the glass beads method using EASY-TRAP gave a fragment for use as the promoter insert.

PCR using the plasmid pEGFP carrying the jellyfish (Aequorea victoria) green fluorescent protein variant gene (Clontech) as the template and oligo DNAs shown in SEQ ID NOS: 12 and 13 in the Sequence Listing as the primers was performed to amplify the ORF in the green fluorescent protein variant gene. The PCR product was subjected to agarose gel electrophoresis after terminal double restriction digestion with NheI and HindIII. Purification of the band of about 700 bp by the glass beads method using EASY-TRAP gave a fragment for use as the ORF insert.

The S. pombe multicloning vector pTL2M was cleaved by double restriction digestion with SpeI and HindIII and then subjected to agarose gel electrophoresis. Purification of the band of about 4500 bp by the glass beads method using EASY-TRAP gave a vector fragment.

The three fragments were ligated with a DNA ligation kit and transformed into E. coli strain DH5. E. coli colonies were screened for the inducible expression vector pRI0EGFP for expression of the green fluorescent protein variant shown in FIG. 10 and SEQ ID NO: 14 in the Sequence Listing through base sequencing and restriction mapping, and the inducible expression vector was recovered by the alkali-SDS method on a preparative scale.
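
As a rough illustration of why the three fragments of Example 8 can join into one circular plasmid, the following Python sketch compares the cohesive ends left by the enzymes named above. The recognition sequences and cut positions are the standard ones for SpeI, NheI, HindIII and EcoRI; the fragment labels, the left/right ordering of the ends and the function names are assumptions made for illustration, not details taken from the examples.

```python
# Recognition sequence and top-strand cut offset for each enzyme named in the text.
ENZYMES = {
    "SpeI": ("ACTAGT", 1),     # A^CTAGT
    "NheI": ("GCTAGC", 1),     # G^CTAGC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "EcoRI": ("GAATTC", 1),    # G^AATTC
}


def overhang(enzyme: str) -> str:
    """5' overhang left by a palindromic cutter (valid for the enzymes above)."""
    site, cut = ENZYMES[enzyme]
    return site[cut:len(site) - cut]


def ends_compatible(a: str, b: str) -> bool:
    """Two ends can anneal when their single-stranded overhangs are identical."""
    return overhang(a) == overhang(b)


def circle_closes(fragments) -> bool:
    """fragments: ordered list of (name, left_enzyme, right_enzyme)."""
    n = len(fragments)
    return all(
        ends_compatible(fragments[i][2], fragments[(i + 1) % n][1])
        for i in range(n)
    )


# Assumed arrangement of the three fragments of Example 8.
example8 = [
    ("promoter insert", "SpeI", "NheI"),
    ("EGFP ORF insert", "NheI", "HindIII"),
    ("pTL2M vector fragment", "HindIII", "SpeI"),
]
print(circle_closes(example8))
print(ends_compatible("SpeI", "NheI"))
```

Because SpeI and NheI leave the same CTAG overhang, both ends of the promoter insert are mutually compatible, so the ends alone do not fix its orientation. This may be one reason the colonies are screened by base sequencing and restriction mapping.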

EXAMPLE 9 Preparation of S. pombe Transformant ASP138

S. pombe wild-type strain ARC001 [leu1-32 h⁻ (isogenic to ATCC38399)] was transformed with the inducible expression vector pRI0EGFP for expression of the green fluorescent protein variant and a transducing vector pAL7 as described by Okazaki et al. (Okazaki et al., "Nucleic Acids Res.", 18, 6485-6489, 1990).

1 ml of a preincubated ARC001 culture in YPD medium was incubated in minimum medium MB+Leu with shaking at 30° C. for 14 hours to a cell density of 3×10⁷ per 1 ml, and the cells were collected, washed with water, suspended in 1 ml of 0.1M lithium acetate (pH 5.0) and incubated at 30° C. for 60 minutes. A 100 μl portion of the suspension was mixed thoroughly with 4 μg of the inducible expression vector pRI0EGFP and 0.5 μg of PstI-digested pAL7 in 15 μl TE and then with 290 μl of 50% PEG 4000, and incubated at 30° C. for 60 minutes, at 43° C. for 15 minutes and at room temperature for 10 minutes, successively. After centrifugal removal of PEG, the cells were suspended in 1 ml of 1/2 YEL+Leu medium. After 10-fold dilution, 1 ml of the suspension was incubated at 32° C. for 2 hours, and a 300 μl portion was spread on minimum medium agar MMA. After 3 days of incubation at 32° C., about 300 independent colonies had developed on the plate.
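
For orientation only, the colony count of Example 9 can be scaled back to the whole transformation mixture as sketched below in Python. The sketch assumes that the 10-fold dilution and the 300 μl plated portion are the only sampling steps and that the cells do not multiply during the 2-hour incubation; these assumptions and the function names are illustrative and are not stated in the example.

```python
def plated_fraction(plated_ul: float, dilution_factor: float,
                    resuspension_ml: float = 1.0) -> float:
    """Fraction of the whole transformation mixture that ends up on one plate."""
    total_equivalent_ul = resuspension_ml * 1000.0 * dilution_factor
    return plated_ul / total_equivalent_ul


def estimated_total_transformants(colonies: int, fraction: float) -> float:
    """Scale the colony count on one plate up to the whole mixture."""
    return colonies / fraction


fraction = plated_fraction(plated_ul=300.0, dilution_factor=10.0)
total = estimated_total_transformants(colonies=300, fraction=fraction)
per_ug = total / 4.0  # 4 ug of pRI0EGFP was used in Example 9
print(f"plated fraction: {fraction:.3f}")
print(f"estimated transformants: {total:.0f} ({per_ug:.0f} per ug of vector)")
```

Under these assumptions the plate carries about 3% of the mixture, which would correspond to roughly 10,000 colony-forming transformants in total.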

Ten colonies of the transformant were inoculated in 2 ml of YEL medium containing 10 μg/ml antibiotic G418 (YEL10 medium) and incubated with shaking at 32° C. Two days later, 6 clones were viable. In their subcultures, 4 clones were viable 3 days later. The putative desired transformant (ASP138 strain) was frozen in glycerol and stored for use in subsequent experiments.

EXAMPLE 10 Analysis of Expression of Green Fluorescent Protein Variant

S. pombe transformant ASP138 (Example 9) was inoculated in YPD medium containing 100 μg/ml G418 (YPD100) and incubated at 32° C. for 2 days. Green fluorescence was observed from each cell under a fluorescence microscope (excitation wavelength 490 nm/emission wavelength 530 nm). Green fluorescence emission was also observed upon ultraviolet irradiation from the centrifugally collected cells. Thus, expression of the desired green fluorescent protein in the active form was confirmed.

Strain ASP138 (Example 9) was incubated in 5 ml YPD100 at 32° C. for 3 days, collected, washed, suspended in 50 mM tris-HCl (pH 7.5) and disrupted with glass beads in a mini bead beater (Biospec). After removal of the glass beads, the cell extract was heated in the presence of SDS (1%) at 80° C. for 15 minutes. Separately, a negative control was extracted from the transformant carrying pRI0M by the same procedure.

50 μg protein from the extract was analyzed by SDS-polyacrylamide gel electrophoresis (FIGS. 11(a) and (b)). After Coomassie Brilliant Blue (CBB) staining, the extract and a recombinant green fluorescent protein (Clontech) as the positive control showed major bands with a molecular weight of 25000, but the negative control did not. Further, 50 μg protein from the extract was analyzed by SDS-polyacrylamide gel electrophoresis followed by western blotting on a PVDF membrane using an anti-green fluorescent protein antibody (Clontech). The cell extract and the positive control showed major bands with a molecular weight of 25000, but the negative control did not. These results provide biological evidence of expression of the desired green fluorescent protein.

EXAMPLE 11 Optimization of the Incubation Method (1)

The transformed S. pombe strain ASP138 (Example 9) was incubated in YPD medium containing 100 μg/ml G418 (YPD100) at 32° C. The expression level of the green fluorescent protein in the cell culture was determined fluorometrically by means of a microplate reader (Corona Electric Co., Ltd.) equipped with a fluorescence attachment (excitation wavelength 490 nm/emission wavelength 530 nm) (FIGS. 12(a) and (b)). OD, FLU/OD and time denote the cell density, the fluorescence intensity per cell and the incubation time, respectively. The results show that strain ASP138 did not express the green fluorescent protein variant until the late-growth phase, after glucose exhaustion in the mid-growth phase, unlike strain ASP122 having a non-inducible cytomegalovirus promoter [a transformant carrying a recombination product of phGFPS65T (Clontech) for expression of the S65T green fluorescent protein variant, prepared as disclosed in JP-A-7-163373]. This is clearly due to repression of the inducible invertase promoter in the presence of glucose and subsequent derepression upon glucose exhaustion, and demonstrates the applicability of this mechanism to expression of the green fluorescent protein as a heterologous protein.
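
The FLU/OD quantity used in FIGS. 12 to 14 is a simple normalization, which may be sketched in Python as follows. The blank correction, the threshold and the time-course values are hypothetical and serve only to show how an onset of expression could be read from such a curve; they are not measurements from the examples.

```python
def fluorescence_per_cell(flu: float, od: float, blank: float = 0.0) -> float:
    """Fluorescence intensity per unit cell density (FLU/OD)."""
    if od <= 0:
        raise ValueError("cell density must be positive")
    return (flu - blank) / od


def induction_onset(times, flu, od, threshold):
    """First time point at which FLU/OD reaches `threshold`, or None."""
    for t, f, d in zip(times, flu, od):
        if fluorescence_per_cell(f, d) >= threshold:
            return t
    return None


# Hypothetical time course, for illustration only.
times_h = [0, 12, 24, 36, 48]
od = [0.1, 1.0, 4.0, 8.0, 9.0]
flu = [0.0, 1.0, 4.0, 80.0, 270.0]
print(induction_onset(times_h, flu, od, threshold=5.0))
```

In this made-up series the cell density rises first and FLU/OD crosses the threshold only at the 36-hour point, which mimics the delayed expression pattern described for an inducible promoter.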

EXAMPLE 12 Incubation of Strain ASP138 (2)

The transformed S. pombe strain ASP138 (Example 9) was incubated in YPD medium containing 100 μg/ml G418 (glucose concentration 8%) at 32° C. to the mid-growth phase, and after medium change, incubated in YPDG medium containing 100 μg/ml G418 (glucose concentration 0.1%, glycerol concentration 3%) (FIGS. 13(a) and (b)). OD, FLU/OD and time denote the cell density, the fluorescence intensity per cell and the incubation time, respectively. The results show that while the green fluorescent protein was not expressed in the cells incubated without medium change (Untreated), expression of the green fluorescent protein in the cells incubated with the medium change was activated by the medium change (Medium-changed), probably because the depletion of glucose in the medium provoked derepression of the invertase promoter and thereby induced the protein expression. The high-level expression induced by the medium change to a low-glucose expression medium suggests that use of a growth medium (with a high glucose concentration) and an expression medium (with a low glucose concentration) can differentiate between cell growth and protein expression. It was demonstrated that the repression of the inducible invertase promoter in the presence of glucose and derepression by exhaustion of glucose could be utilized in expression of the green fluorescent protein as a heterologous protein.

EXAMPLE 13 Incubation of Strain ASP138 (3)

The transformed S. pombe strain ASP138 (Example 9) was incubated in YPD medium containing 100 μg/ml G418 (glucose concentration 8%) at 32° C. to the late-growth phase, and after medium change, incubated in YPDG medium containing 100 μg/ml G418 (glucose concentration 0.1%, glycerol concentration 3%) (FIGS. 14(a) and (b)). OD, FLU/OD and time denote the cell density, the fluorescence intensity per cell and the incubation time, respectively. The results show that while the green fluorescent protein was not expressed in the cells incubated without medium change (Untreated), expression of the green fluorescent protein in the cells incubated with the medium change was activated by the medium change (Medium-changed), probably because the depletion of glucose in the medium provoked derepression of the invertase promoter and thereby induced the protein expression. The high-level expression induced by the medium change to a low-glucose expression medium from a high-glucose medium before glucose exhaustion suggests that use of a growth medium (with a high glucose concentration) and an expression medium (with a low glucose concentration) can differentiate between cell growth and protein expression. It was demonstrated that the repression of the inducible invertase promoter in the presence of glucose and derepression by exhaustion of glucose could be utilized in expression of the green fluorescent protein as a heterologous protein.

EXAMPLE 14 Construction of Inducible Lipocortin I Expression Vector pRI0LPI

PCR using plasmid pINV3000 (Example 1) carrying the S. pombe invertase gene as the template and oligo DNAs shown in SEQ ID NOS: 15 and 16 in the Sequence Listing as primers gave an amplification product containing the promoter region in the invertase gene and having restriction enzyme recognition sequences at both ends. After terminal double restriction digestion with SpeI and EcoRI, the amplification product was subjected to agarose gel electrophoresis. Purification of the band of about 3000 bp by the glass beads method using EASY-TRAP gave a fragment for use as a promoter insert.

The expression vector pTL2L (JP-A-7-163373) carrying a human lipocortin I gene was subjected to agarose gel electrophoresis after terminal double restriction digestion with EcoRI and HindIII. Purification of the band of about 1000 bp by the glass beads method using EASY-TRAP gave a fragment for use as the ORF insert.

The S. pombe multicloning vector pTL2M (JP-A-7-163373) was subjected to agarose gel electrophoresis after terminal double restriction digestion with SpeI and HindIII. Purification of the band of about 4500 bp by the glass beads method using EASY-TRAP gave a vector fragment.

The three fragments were ligated with a DNA ligation kit and transformed into E. coli strain DH5. E. coli colonies were screened for the inducible lipocortin I expression vector pRI0LPI through base sequencing and restriction mapping, and the vector was recovered by the alkali-SDS method on a preparative scale.

EXAMPLE 15 Preparation of Fission Yeast Schizosaccharomyces pombe Transformant ASP139

The S. pombe wild-type strain ARC001 was transformed with the inducible lipocortin I expression vector pRI0LPI and a transducing vector pAL7 as described by Okazaki et al.

1 ml of a preincubated ARC001 culture in YPD medium was incubated in minimum medium MB+Leu with shaking at 30° C. for 16 hours to a cell density of 1×10⁷ per 1 ml, and the cells were collected, washed with water, suspended in 1 ml of 0.1M lithium acetate (pH 5.0) and incubated at 30° C. for 60 minutes. A 100 μl portion of the suspension was mixed thoroughly with 2 μg of the recombinant vector pRI0LPI and 0.5 μg of PstI-digested pAL7 in 15 μl TE and then with 290 μl of 50% PEG 4000, and incubated at 30° C. for 60 minutes, at 43° C. for 15 minutes and at room temperature for 10 minutes, successively. After centrifugal removal of PEG, the cells were suspended in 1 ml of 1/2 YEL+Leu medium. After 10-fold dilution, 1 ml of the suspension was incubated at 32° C. for 2 hours, and a 300 μl portion was spread on minimum medium agar MMA. After 3 days of incubation at 32° C., about 300 independent colonies had developed on the plate.
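
As a small worked example of the mixing step above, the final PEG 4000 concentration in the transformation mixture can be estimated in Python as follows. The calculation assumes that the three volumes (100 μl of cells, 15 μl of DNA in TE, 290 μl of 50% PEG 4000) are simply additive; the function name is hypothetical.

```python
def final_concentration(stock_percent: float, stock_ul: float, other_volumes_ul) -> float:
    """Concentration of a stock solution after mixing with other volumes."""
    total_ul = stock_ul + sum(other_volumes_ul)
    return stock_percent * stock_ul / total_ul


# 290 ul of 50% PEG 4000 mixed with 100 ul of cells and 15 ul of DNA in TE.
peg = final_concentration(50.0, 290.0, [100.0, 15.0])
print(f"{peg:.1f}% PEG 4000 in {290 + 100 + 15} ul")
```

Under this assumption the cells are exposed to roughly 36% PEG 4000 in a total volume of 405 μl during the incubation steps.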

Ten colonies of the transformant were inoculated in 2 ml of YEL medium containing 10 μg/ml antibiotic G418 (YEL10 medium) and incubated with shaking at 32° C. Two days later, 2 clones were viable. All of their subcultures were viable 3 days later. The putative desired transformant (ASP139 strain) was frozen in glycerol and stored for use in subsequent experiments.

EXAMPLE 16 Analysis of Lipocortin I Expression

S. pombe transformant ASP139 (Example 15) was incubated in YPD medium containing 100 μg/ml G418 (glucose concentration 8%) at 32° C. to the stationary phase and collected as a non-inducible cell culture. Separately, ASP139 was incubated in the same medium at first to the mid-growth phase, then after medium change, incubated in YPDG medium containing 100 μg/ml G418 (glucose concentration 0.1%, glycerol concentration 3%) to the stationary phase and collected as an inducible cell culture. Both cell cultures were washed, suspended in 50 mM tris-HCl (pH 7.5) and disrupted with glass beads in a mini bead beater. After removal of the glass beads, the cell extracts were heated in the presence of SDS (1%) at 80° C. for 15 minutes.

50 μg protein from each extract was separated by SDS-polyacrylamide gel electrophoresis and stained with Coomassie Brilliant Blue. The extract from the inducible cell culture showed a major band with a molecular weight of about 45000, which is the same as the deduced molecular weight of the recombinant lipocortin I protein, but the extract from the non-inducible cell culture did not. More sensitive western analysis of band density showed that the band from the inducible cell culture extract was 10 times denser than the band from the non-inducible cell culture extract. The results show that while lipocortin I was not expressed in the cells incubated without medium change, the expression of lipocortin I in the cells incubated with the medium change was activated by the medium change, probably because the depletion of glucose in the medium provoked derepression of the invertase promoter and thereby induced the protein expression. The high-level expression induced by the medium change to a low-glucose expression medium from a high-glucose medium suggests that use of a growth medium (with a high glucose concentration) and an expression medium (with a low glucose concentration) can differentiate between cell growth and protein expression. It was demonstrated that the repression of the inducible invertase promoter in the presence of glucose and derepression by exhaustion of glucose could be utilized in expression of lipocortin I as a heterologous protein.

EXAMPLE 17 Construction of Expression Vector pTL2INV1 Carrying Invertase Gene

PCR using plasmid pINV3000 (Example 1) carrying the S. pombe invertase gene as the template and oligo DNAs shown in SEQ ID NOS: 17 and 18 in the Sequence Listing as primers gave an amplification product containing the ORF in the invertase gene and having restriction enzyme recognition sequences at both ends. After terminal double restriction digestion with AflIII (New England Biolabs) and HindIII (Takara Shuzo Co., Ltd.), the amplification product was subjected to agarose gel electrophoresis. Purification of the band of about 3000 bp by the glass beads method using EASY-TRAP gave a fragment for use as an insert.

The S. pombe multicloning vector pTL2M (JP-A-7-163373) was subjected to agarose gel electrophoresis after terminal double restriction digestion with SpeI and EcoRI. Purification of the band of about 4500 bp by the glass beads method using EASY-TRAP gave a vector fragment.

The two fragments were ligated with a DNA ligation kit (Takara Shuzo Co., Ltd.) and transformed into E. coli strain DH5 (Toyobo Co., Ltd.). E. coli colonies were screened for the invertase gene expression vector pTL2INV1 through base sequencing and restriction mapping, and the vector was recovered by the alkali-SDS method on a preparative scale.

EXAMPLE 18 Construction of Secretory Expression Vector pSL2I06a'c1 Using the Signal Sequence from the Invertase Gene

PCR using plasmid pINV3000 (Example 1) carrying the S. pombe invertase gene as the template and oligo DNAs shown in SEQ ID NOS: 19 and 20 in the Sequence Listing as primers gave an amplification product containing the ORF in the invertase gene and having restriction enzyme recognition sequences at both ends. After terminal double restriction digestion with SpeI (Takara Shuzo Co., Ltd.) and EcoRI (Takara Shuzo Co., Ltd.), the amplification product was subjected to agarose gel electrophoresis. Purification of the band of about 700 bp by the glass beads method using EASY-TRAP (Takara Shuzo Co., Ltd.) gave a fragment for use as a signal insert.

PCR using plasmid pSL2P06a'c1 (WO96/23890) containing human interleukin-6a'c1 variant cDNA as the template and oligo DNAs shown in SEQ ID NOS: 21 and 22 in the Sequence Listing as primers gave an amplification product containing the interleukin-6a'c1 variant ORF. After terminal double restriction digestion with EcoRI and HindIII (Takara Shuzo Co., Ltd.), the amplification product was subjected to agarose gel electrophoresis. Purification of the band of about 600 bp by the glass beads method using EASY-TRAP (Takara Shuzo Co., Ltd.) gave a fragment for use as a gene insert.

The S. pombe multicloning vector pTL2M (JP-A-7-163373) was subjected to agarose gel electrophoresis after terminal double restriction digestion with SpeI and HindIII. Purification of the band of about 4500 bp by the glass beads method using EASY-TRAP gave a vector fragment.

The three fragments were ligated with a DNA ligation kit and transformed into E. coli strain DH5 (Toyobo Co., Ltd.). E. coli colonies were screened for the IL-6a'c1 secretory expression vector pSL2I06a'c1 through base sequencing and restriction mapping, and the vector was recovered by the alkali-SDS method on a preparative scale.

EXAMPLE 19 Preparation of S. pombe Transformant ASP168

A leucine-requiring S. pombe strain ARC001 was transformed with the interleukin-6a'c1 variant secretory expression vector pSL2I06a'c1 (Example 18) and a transducing vector pAL7 as described by Okazaki et al.

1 ml of a preincubated ARC001 culture in YPD medium was incubated in 100 ml of minimum medium MB+Leu with shaking at 30° C. for 16 hours. The cells were collected, washed with water, suspended in 0.1M lithium acetate (pH 5.0) at a cell density of 10⁹ cells/ml and incubated at 30° C. for 60 minutes. A 100 μl portion of the suspension was mixed thoroughly with 2 μg of the recombinant vector pSL2I06a'c1 and 1.0 μg of PstI-digested pAL7 in 15 μl TE and then with 290 μl of 50% PEG 4000, and incubated at 30° C. for 60 minutes, at 43° C. for 15 minutes and at room temperature for 10 minutes, successively. After centrifugal removal of PEG, the cells were suspended in 1 ml of 1/2 YEL+Leu medium. After 10-fold dilution, 1 ml of the suspension was incubated at 32° C. for 2 hours, and a 300 μl portion was spread on minimum medium agar MMA. After 3 days of incubation at 32° C., about 1000 independent colonies had developed on the plate.

The transformants (colonies) were inoculated in 2 ml of YEL medium containing 10 μg/ml antibiotic G418 (YEL10 medium) and incubated with shaking at 32° C. for 5 days. The viable clones of the putative desired transformant, designated as strain ASP168, were frozen in glycerol and stored for use in subsequent experiments.

EXAMPLE 20 Analysis of Expressed Secretory Interleukin-6a'c1 Variant in Culture Medium

The fission yeast Schizosaccharomyces pombe transformant ASP168 (Example 19) was incubated in MA-Casamino acid medium (MA medium containing 2% Casamino acid and 3% glucose; for the composition of MA medium, refer to "Alfa et al., Experiments with Fission Yeast, Cold Spring Harbor Laboratory Press, 1993") containing 500 μg/ml G418 at 32° C. for 2 days.

The cell culture was centrifuged, and the supernatant was concentrated 100-fold through a membrane filter (Amicon Co., Ltd.). Analysis of the concentrated sample by SDS-polyacrylamide gel electrophoresis followed by Coomassie Brilliant Blue staining gave the SDS-PAGE pattern shown in FIG. 15. Lane 1 is the purified interleukin-6a'c1 variant, lane 2 is the supernatant from the ASP168 cell culture, and lane 3 is the supernatant from a cell culture of the control strain ASP021 [transformant carrying a recombinant vector with no ORF prepared by recombination of pTL2M (JP-A-7-163373) by the method disclosed in JP-A-7-163373 without introduction of any gene to be expressed]. The band with a molecular weight of about 20000 in lane 3 seemed attributable to the interleukin-6a'c1 variant from the comparison of lanes 1 and 3.

Further analysis by western blotting using an anti-IL-6a'c1 antibody gave the pattern shown in FIG. 16. Lane 1 is the purified interleukin-6a'c1 variant, lane 2 is the supernatant from the ASP168 cell culture, and lane 3 is the supernatant from a cell culture of the control strain ASP021. The band with a molecular weight of about 20000 in lane 3 was identified as the interleukin-6a'c1 variant from the comparison of lanes 1 and 3.

SEQUENCE LISTING

<160> NUMBER OF SEQ ID NOS: 24

<210> SEQ ID NO 1
<211> LENGTH: 4748
<212> TYPE: DNA
<213> ORGANISM: Schizosaccharomyces pombe
<220> FEATURE:
<221> NAME/KEY: CDS
<222> LOCATION: (2810)..(4552)
<400> SEQUENCE: 1

ggatcctagt ccgcgaaatc gagatgcttt gaagattaaa attaaattta attttatgcg    60
agactggttt ccttattttt tgtatagtcg catgcaagcg aggttcgcat aatttggaaa    120
ataaaggtag tcaagaagac gttgaattaa ggctgcagtt tcaaagtact ctacaaacga    180
ttccttttaa aaaaaaagat tcaaaaaaaa ggcaaagggt ttaagtaatg cttgttattt    240
caatttacct ccaaacagtt actaatgcaa ttgcaaaaaa aaaacctacc tattgaatca    300
aaatttctag cccatccatc gctcctcaag ataaaggaat cgatattttg agtttaaggg    360
agttgctgat agatttcaga attaaaaatt tttggaaaag gatgtcgaga acaagaagat    420
acgtctagat tgctgatgat gcattctagc agacggaaat acaacgatat gtggacagca    480
cgacttttga tccgttcgga tcaaaaggaa gagaaatatc catctttcaa gaagaatgca    540
ggaaaagcaa taaatgccca tttgattcct aaattatccc caaaaatgaa cattatgaga    600
tcttcttgtg ggagacagga aatttcgcaa ttccaaacga aaattcggct ctttttttta    660
ccccacagtt gcggggtaaa tgatgtaacg gaccttgggg gaaaggatga tgagttagtt    720
gggaagcgga aaaaatggaa aacggaagta agaatagaaa ccagtatggc tgagtgcaat    780
ggcggaaaag attttacaga gatgacaaga atctatttat ctataaggaa aaactttttc    840
caaatttgtc taaaaacgca ttctcctcaa ttgcctctag gtagatgata taacgaattg    900
gaacgagaca tcgctaaccg gttttctttg taaatgacat tttgtagtgg gagtaagttt    960
gaatggaggg atagacagat gaatagtatg agatagaaga atagtatata taatgattaa    1020
gatgaacaaa taaaaattga aagaaaaaag aaattgttgg ctcatttggt tcatacacat    1080
gttggttcat acaactttta cccatcgtaa gtattataag taaaaaatag agtacgaaaa    1140
gctataagta gtgaagcaaa aaaatagaaa aatagaaaaa aaaatatata taaaaaaata    1200
taataaaaat aaaactcata agagacgtaa aacacaagaa ttgtctatca tttgttcttt    1260
aagaagcacc accattctgt aaaactcttc atttctcatt agcaaggacc cttttcattc    1320
cttcctcttt agaatccttt tcattataac gaattggata atacgcaaat aagaacacat    1380
cccctaaata cgatatatcg atccattttt tactttgcct agcttattgc tgtacaattc    1440
catttaaata gtttctcctc aagaaagatc gtcaatggag gcgacaatat accggaattt    1500
aagttgcgga cacagagctt gaaaagactg cattttgtat tgttttcaag taaatgaaac    1560
tgagttttga agtctcaaaa tacatcttat gtattgaaca ttagaagaac atataagata    1620
gatcttgaga gctcaattca tcgacattct agccatcata ctgcgatctt agacattgtc    1680
agcacaacct tagatcgaaa atgaacacgt taccaaacgt tgtctaaaac ttgccgaatc    1740
ttatctccgc attacttccg taatccttag tacatacgct gcaatttcgg aaggtcatga    1800
tcgacttttt gtgtagctat aagtgacgca aatgagaaac atgacaaggt gcgatattta    1860
gcaagatatt atgcatttga tggagaaagg aaatttcgga tgtatatata gtaccgttag    1920
ctgcgctttt tttggtcatc cataattttc aaactcactg ctttcgatca gatttaccgt    1980
ttttaaggtc tttattgctt tgtgatctgt aggttggaac atctatagtt cattttctaa    2040
aagatccttt catcgtttca tcggatagta atcgttcaag aaaaaaaaag aaaaaaagaa    2100
aaagaaaaag aaaagaaaaa taaaccgcta taattcatta cctatttgac tgaaggttct    2160
tcatcttgaa ttgttttgaa tcaaaataaa gaaattatta ttattatttt ttttcttcgc    2220
tttttcttta tccattcgtc gaaactattt ttctgctgat aaaagcaatc attccttttt    2280
cctgcttctc ttgttattcg aattttaaac gacttttttt cctcgtccat tccctaattc    2340
tttgcgacct tttctgattc tatccttggt ttgtactttc gttgtgtaat tgttgagaaa    2400
gtgaactgat tatttaattg ttgtgaaaaa aattctaaaa ctattttgtt tttcttgatc    2460
attcatcctt tgctcgcttg cttgaatatt acagaaattc gtctcccttt caacggaata    2520
tgataatttg ttgaatactc taaatcaatt aacacctatc aaaagctgaa acattaaatc    2580
tattctcacc aaaaaaaaag actcaagctt cttcgttgtt ggccggtctc ttttttgttt    2640
tacgattgtt aaattttata ctcacaactg ccaattctcc acttttgact atttattgat    2700
agtccctatt taattttctg ttcaccgatt atcgtctttt ttgtaaataa tctttcttgg    2760
aaccaaccaa ttaatacgtt ataatcgcta actttgaaga tttgctaca atg ttt ttg    2818
                                                      Met Phe Leu
                                                      1
aaa tat att tta gct agt ggc att tgc ctc gtc tct ctc tta tca tct    2866
Lys Tyr Ile Leu Ala Ser Gly Ile Cys Leu Val Ser Leu Leu Ser Ser
                                            15
aca aac gcg gct ccc cgt cac tta tat gta aaa cgt tat cct gtc att    2914
Thr Asn Ala Ala Pro Arg His Leu Tyr Val Lys Arg Tyr Pro Val Ile
                                                            35
tat aat gct tcc aac atc act gaa gtc agc aat tct acc acc gtt cct    2962
Tyr Asn Ala Ser Asn Ile Thr Glu Val Ser Asn Ser Thr Thr Val Pro
                                                        50
cct cct cca ttc gta aat aca acg gcc cct aat ggg act tgt ttg ggt    3010
Pro Pro Pro Phe Val Asn Thr Thr Ala Pro Asn Gly Thr Cys Leu Gly
                                                    65
aac tat aac gag tat ctt cct tca gga tat tac aat gct acc gat cgt    3058
Asn Tyr Asn Glu Tyr Leu Pro Ser Gly Tyr Tyr Asn Ala Thr Asp Arg
                                                80
ccc aaa att cat ttt act cct tct tcc ggt ttc atg aat gat cca aac    3106
                                                                       Pro Lys Ile His Phe Thr Pro Ser Ser Gly Ph - #e Met Asn Asp Pro Asn           #     95                                                                      - gga ttg gta tat act ggc ggc gtc tat cac at - #g ttc ttc caa tat tca         3154                                                                          Gly Leu Val Tyr Thr Gly Gly Val Tyr His Me - #t Phe Phe Gln Tyr Ser           100                 1 - #05                 1 - #10                 1 -       #15                                                                           - cca aaa act cta aca gcc ggc gaa gtt cat tg - #g ggt cac aca gtt tcc         3202                                                                          Pro Lys Thr Leu Thr Ala Gly Glu Val His Tr - #p Gly His Thr Val Ser           #               130                                                           - aag gat tta atc cat tgg gag aat tat cct at - #t gcc atc tac ccc gat         3250                                                                          Lys Asp Leu Ile His Trp Glu Asn Tyr Pro Il - #e Ala Ile Tyr Pro Asp           #           145                                                               - gaa cat gaa aac gga gtt ttg tcc ctc cca tt - #t agt ggc agt gca gtc         3298                                                                          Glu His Glu Asn Gly Val Leu Ser Leu Pro Ph - #e Ser Gly Ser Ala Val           #       160                                                                   - gtc gat gtt cat aat tct tcc ggt ctc ttt tc - #c aac gac acc att ccg         3346                                                                          Val Asp Val His Asn Ser Ser Gly Leu Phe Se - #r Asn Asp Thr Ile Pro           #   175                                                                       - gaa gag cgc att gtt tta att tat acc gat ca - #t tgg act ggt gtt gct         3394                                                     
                     Glu Glu Arg Ile Val Leu Ile Tyr Thr Asp Hi - #s Trp Thr Gly Val Ala           180                 1 - #85                 1 - #90                 1 -       #95                                                                           - gag cgt cag gct att gcg tat acc act gat gg - #t gga tat act ttc aaa         3442                                                                          Glu Arg Gln Ala Ile Ala Tyr Thr Thr Asp Gl - #y Gly Tyr Thr Phe Lys           #               210                                                           - aaa tat tca gga aat ccc gtt ctt gac att aa - #t tca ctt caa ttc cgc         3490                                                                          Lys Tyr Ser Gly Asn Pro Val Leu Asp Ile As - #n Ser Leu Gln Phe Arg           #           225                                                               - gac ccc aag gta ata tgg gat ttc gat gct aa - #t cgt tgg gtg atg att         3538                                                                          Asp Pro Lys Val Ile Trp Asp Phe Asp Ala As - #n Arg Trp Val Met Ile           #       240                                                                   - gta gct atg tct caa aat tat gga att gcc tt - #t tat tcc tcc tat gac         3586                                                                          Val Ala Met Ser Gln Asn Tyr Gly Ile Ala Ph - #e Tyr Ser Ser Tyr Asp           #   255                                                                       - ttg att cac tgg acc gag tta tct gtt ttc tc - #c act tct ggt tat ttg         3634                                                                          Leu Ile His Trp Thr Glu Leu Ser Val Phe Se - #r Thr Ser Gly Tyr Leu           260                 2 - #65                 2 - #70                 2 -       #75                                                                           - ggg ttg caa tat gaa tgc cct gga atg gct cg - #t gtg ccc gtt gaa ggc         3682                         
                                                 Gly Leu Gln Tyr Glu Cys Pro Gly Met Ala Ar - #g Val Pro Val Glu Gly           #               290                                                           - acc gat gaa tac aaa tgg gta ctc ttc atc tc - #c atc aat cct ggc gct         3730                                                                          Thr Asp Glu Tyr Lys Trp Val Leu Phe Ile Se - #r Ile Asn Pro Gly Ala           #           305                                                               - cca ttg gga gga tcc gtt gtc caa tac ttt gt - #t ggc gat tgg aat ggt         3778                                                                          Pro Leu Gly Gly Ser Val Val Gln Tyr Phe Va - #l Gly Asp Trp Asn Gly           #       320                                                                   - aca aac ttc gtc ccc gat gat ggc caa act ag - #a ttc gta gac ttg ggt         3826                                                                          Thr Asn Phe Val Pro Asp Asp Gly Gln Thr Ar - #g Phe Val Asp Leu Gly           #   335                                                                       - aag gac ttt tac gcc agc gct ttg tat cac tc - #g tct tcc gcc aat gcc         3874                                                                          Lys Asp Phe Tyr Ala Ser Ala Leu Tyr His Se - #r Ser Ser Ala Asn Ala           340                 3 - #45                 3 - #50                 3 -       #55                                                                           - gat gtt att gga gtt gga tgg gct agc aac tg - #g caa tac acc aac caa         3922                                                                          Asp Val Ile Gly Val Gly Trp Ala Ser Asn Tr - #p Gln Tyr Thr Asn Gln           #               370                                                           - gct cct act caa gtt ttc cgc agt gct atg ac - #a gtt gca cga aaa ttc         3970                                                                          
Ala Pro Thr Gln Val Phe Arg Ser Ala Met Th - #r Val Ala Arg Lys Phe           #           385                                                               - act ctt cgc gac gtt cct cag aac ccc atg ac - #c aac ctt act tct ctc         4018                                                                          Thr Leu Arg Asp Val Pro Gln Asn Pro Met Th - #r Asn Leu Thr Ser Leu           #       400                                                                   - att caa acc cca ttg aat gtt tct ctc tta cg - #a gat gaa aca cta ttt         4066                                                                          Ile Gln Thr Pro Leu Asn Val Ser Leu Leu Ar - #g Asp Glu Thr Leu Phe           #   415                                                                       - acc gca ccc gtt atc aat agt tca agt agt ct - #t tcg ggc tct ccg att         4114                                                                          Thr Ala Pro Val Ile Asn Ser Ser Ser Ser Le - #u Ser Gly Ser Pro Ile           420                 4 - #25                 4 - #30                 4 -       #35                                                                           - act ctt cca agc aat acc gca ttc gag ttc aa - #t gtc aca ctc agt atc         4162                                                                          Thr Leu Pro Ser Asn Thr Ala Phe Glu Phe As - #n Val Thr Leu Ser Ile           #               450                                                           - aat tac aca gaa ggc tgc aca aca gga tat tg - #t ctg ggg cgt att atc         4210                                                                          Asn Tyr Thr Glu Gly Cys Thr Thr Gly Tyr Cy - #s Leu Gly Arg Ile Ile           #           465                                                               - att gat tct gat gat cca tac aga tta caa tc - #c atc tcc gtg gac gtt         4258                                                                          Ile Asp Ser Asp Asp Pro Tyr Arg Leu Gln Se - #r 
Ile Ser Val Asp Val           #       480                                                                   - gat ttt gca gct agc act tta gtc att aat cg - #t gcc aaa gct cag atg         4306                                                                          Asp Phe Ala Ala Ser Thr Leu Val Ile Asn Ar - #g Ala Lys Ala Gln Met           #   495                                                                       - gga tgg ttt aat tca ctt ttc acg cct tct tt - #t gcc aac gat att tac         4354                                                                          Gly Trp Phe Asn Ser Leu Phe Thr Pro Ser Ph - #e Ala Asn Asp Ile Tyr           500                 5 - #05                 5 - #10                 5 -       #15                                                                           - att tat gga aac gta act ttg tat ggt att gt - #t gac aat gga ttg ctt         4402                                                                          Ile Tyr Gly Asn Val Thr Leu Tyr Gly Ile Va - #l Asp Asn Gly Leu Leu           #               530                                                           - gaa ctg tat gtc aat aat ggc gaa aaa act ta - #c act aat gac ttt ttc         4450                                                                          Glu Leu Tyr Val Asn Asn Gly Glu Lys Thr Ty - #r Thr Asn Asp Phe Phe           #           545                                                               - ttc ctt caa gga gca aca cct gga cag atc ag - #c ttc gct gct ttc caa         4498                                                                          Phe Leu Gln Gly Ala Thr Pro Gly Gln Ile Se - #r Phe Ala Ala Phe Gln           #       560                                                                   - ggc gtt tct ttc aat aat gtt acc gtt acg cc - #a tta aag act atc tgg         4546                                                                          Gly Val Ser Phe Asn Asn Val Thr Val Thr Pr - #o Leu Lys Thr Ile Trp           #   575             
                                                          - aat tgc taaatatttt gtttcaagtt aggaaagtat aataactttt gt - #ccctgcat          4602                                                                          Asn Cys                                                                       580                                                                           - attcaattgt aaagtttagt ttatcctttc atcgtaacca caattgtcac ct - #aaatctct       4662                                                                          - aaaaatctct tcacttatct agttaatgtc gtaacaaaaa agtccagtag ct - #tcgggaaa       4722                                                                          #            4748  acaa gtcgac                                                - <210> SEQ ID NO 2                                                           <211> LENGTH: 581                                                             <212> TYPE: PRT                                                               <213> ORGANISM: Schizosaccharomyces pombe                                     - <400> SEQUENCE: 2                                                           - Met Phe Leu Lys Tyr Ile Leu Ala Ser Gly Il - #e Cys Leu Val Ser Leu         #                 15                                                          - Leu Ser Ser Thr Asn Ala Ala Pro Arg His Le - #u Tyr Val Lys Arg Tyr         #             30                                                              - Pro Val Ile Tyr Asn Ala Ser Asn Ile Thr Gl - #u Val Ser Asn Ser Thr         #         45                                                                  - Thr Val Pro Pro Pro Pro Phe Val Asn Thr Th - #r Ala Pro Asn Gly Thr         #     60                                                                      - Cys Leu Gly Asn Tyr Asn Glu Tyr Leu Pro Se - #r Gly Tyr Tyr Asn Ala         # 80                                                                          - Thr Asp Arg Pro Lys Ile His Phe Thr Pro Se - #r Ser Gly Phe Met Asn 
        #                 95                                                          - Asp Pro Asn Gly Leu Val Tyr Thr Gly Gly Va - #l Tyr His Met Phe Phe         #           110                                                               - Gln Tyr Ser Pro Lys Thr Leu Thr Ala Gly Gl - #u Val His Trp Gly His         #       125                                                                   - Thr Val Ser Lys Asp Leu Ile His Trp Glu As - #n Tyr Pro Ile Ala Ile         #   140                                                                       - Tyr Pro Asp Glu His Glu Asn Gly Val Leu Se - #r Leu Pro Phe Ser Gly         145                 1 - #50                 1 - #55                 1 -       #60                                                                           - Ser Ala Val Val Asp Val His Asn Ser Ser Gl - #y Leu Phe Ser Asn Asp         #               175                                                           - Thr Ile Pro Glu Glu Arg Ile Val Leu Ile Ty - #r Thr Asp His Trp Thr         #           190                                                               - Gly Val Ala Glu Arg Gln Ala Ile Ala Tyr Th - #r Thr Asp Gly Gly Tyr         #       205                                                                   - Thr Phe Lys Lys Tyr Ser Gly Asn Pro Val Le - #u Asp Ile Asn Ser Leu         #   220                                                                       - Gln Phe Arg Asp Pro Lys Val Ile Trp Asp Ph - #e Asp Ala Asn Arg Trp         225                 2 - #30                 2 - #35                 2 -       #40                                                                           - Val Met Ile Val Ala Met Ser Gln Asn Tyr Gl - #y Ile Ala Phe Tyr Ser         #               255                                                           - Ser Tyr Asp Leu Ile His Trp Thr Glu Leu Se - #r Val Phe Ser Thr Ser         #           270                                                               - Gly Tyr Leu Gly Leu Gln Tyr Glu Cys Pro 
Gl - #y Met Ala Arg Val Pro         #       285                                                                   - Val Glu Gly Thr Asp Glu Tyr Lys Trp Val Le - #u Phe Ile Ser Ile Asn         #   300                                                                       - Pro Gly Ala Pro Leu Gly Gly Ser Val Val Gl - #n Tyr Phe Val Gly Asp         305                 3 - #10                 3 - #15                 3 -       #20                                                                           - Trp Asn Gly Thr Asn Phe Val Pro Asp Asp Gl - #y Gln Thr Arg Phe Val         #               335                                                           - Asp Leu Gly Lys Asp Phe Tyr Ala Ser Ala Le - #u Tyr His Ser Ser Ser         #           350                                                               - Ala Asn Ala Asp Val Ile Gly Val Gly Trp Al - #a Ser Asn Trp Gln Tyr         #       365                                                                   - Thr Asn Gln Ala Pro Thr Gln Val Phe Arg Se - #r Ala Met Thr Val Ala         #   380                                                                       - Arg Lys Phe Thr Leu Arg Asp Val Pro Gln As - #n Pro Met Thr Asn Leu         385                 3 - #90                 3 - #95                 4 -       #00                                                                           - Thr Ser Leu Ile Gln Thr Pro Leu Asn Val Se - #r Leu Leu Arg Asp Glu         #               415                                                           - Thr Leu Phe Thr Ala Pro Val Ile Asn Ser Se - #r Ser Ser Leu Ser Gly         #           430                                                               - Ser Pro Ile Thr Leu Pro Ser Asn Thr Ala Ph - #e Glu Phe Asn Val Thr         #       445                                                                   - Leu Ser Ile Asn Tyr Thr Glu Gly Cys Thr Th - #r Gly Tyr Cys Leu Gly         #   460                                                                       - Arg Ile Ile 
Ile Asp Ser Asp Asp Pro Tyr Ar - #g Leu Gln Ser Ile Ser         465                 4 - #70                 4 - #75                 4 -       #80                                                                           - Val Asp Val Asp Phe Ala Ala Ser Thr Leu Va - #l Ile Asn Arg Ala Lys         #               495                                                           - Ala Gln Met Gly Trp Phe Asn Ser Leu Phe Th - #r Pro Ser Phe Ala Asn         #           510                                                               - Asp Ile Tyr Ile Tyr Gly Asn Val Thr Leu Ty - #r Gly Ile Val Asp Asn         #       525                                                                   - Gly Leu Leu Glu Leu Tyr Val Asn Asn Gly Gl - #u Lys Thr Tyr Thr Asn         #   540                                                                       - Asp Phe Phe Phe Leu Gln Gly Ala Thr Pro Gl - #y Gln Ile Ser Phe Ala         545                 5 - #50                 5 - #55                 5 -       #60                                                                           - Ala Phe Gln Gly Val Ser Phe Asn Asn Val Th - #r Val Thr Pro Leu Lys         #               575                                                           - Thr Ile Trp Asn Cys                                                                     580                                                               - <210> SEQ ID NO 3                                                           <211> LENGTH: 7286                                                            <212> TYPE: DNA                                                               <213> ORGANISM: Artificial Sequence                                           <220> FEATURE:                                                                <223> OTHER INFORMATION: Description of Artificial Sequence: DNA              - <400> SEQUENCE: 3                                                           - agcttgaaaa aacctcccac acctccccct gaacctgaaa cataaaatga at - 
#gcaattgt         60                                                                          - tgttgttaac ttgtttattg cagcttataa tggttacaaa taaagcaata gc - #atcacaaa        120                                                                          - tttcacaaat aaagcatttt tttcactgca ttctagttgt ggtttgtcca aa - #ctcatcaa        180                                                                          - tgtatcttat catgtctgga tcgatcccgg caggttgggc gtcgcttggt cg - #gtcatttc        240                                                                          - gaaccccaga gtcccgctca gaagaactcg tcaagaaggc gatagaaggc ga - #tgcgctgc        300                                                                          - gaatcgggag cggcgatacc gtaaagcacg aggaagcggt cagcccattc gc - #cgccaagc        360                                                                          - tcttcagcaa tatcacgggt agccaacgct atgtcctgat agcggtccgc ca - #cacccagc        420                                                                          - cggccacagt cgatgaatcc agaaaagcgg ccattttcca ccatgatatt cg - #gcaagcag        480                                                                          - gcatcgccat gggtcacgac gagatcctcg ccgtcgggca tgcgcgcctt ga - #gcctggcg        540                                                                          - aacagttcgg ctggcgcgag cccctgatgc tcttcgtcca gatcatcctg at - #cgacaaga        600                                                                          - ccggcttcca tccgagtacg tgctcgctcg atgcgatgtt tcgcttggtg gt - #cgaatggg        660                                                                          - caggtagccg gatcaagcgt atgcagccgc cgcattgcat cagccatgat gg - #atactttc        720                                                                          - tcggcaggag caaggtgaga tgacaggaga tcctgccccg gcacttcgcc ca - #atagcagc        780                                                                          - cagtcccttc ccgcttcagt 
gacaacgtcg agcacagctg cgcaaggaac gc - #ccgtcgtg        840                                                                          - gccagccacg atagccgcgc tgcctcgtcc tgcagttcat tcagggcacc gg - #acaggtcg        900                                                                          - gtcttgacaa aaagaaccgg gcgcccctgc gctgacagcc ggaacacggc gg - #catcagag        960                                                                          - cagccgattg tctgttgtgc ccagtcatag ccgaatagcc tctccaccca ag - #cggccgga       1020                                                                          - gaacctgcgt gcaatccatc ttgttcaatc atgcgaaacg atcctcatcc tg - #tctcttga       1080                                                                          - tcagatccgg gacctgaaat aaaagacaaa aagactaaac ttaccagtta ac - #tttctggt       1140                                                                          - ttttcagttc ctcgaggagc tttttgcaaa agcctaggcc tccaaaaaag cc - #tcctcact       1200                                                                          - acttctggaa tagctcagag gccgaggcgg cctcggcctc tgcataaata aa - #aaaaatta       1260                                                                          - gtcagccatg gggcggagaa tgggcggaac tgggcggagt taggggcggg at - #gggcggag       1320                                                                          - ttaggggcgg gactatggtt gctgactaat tgagatgcat gctttgcata ct - #tctgcctg       1380                                                                          - ctggggagcc tggggacttt ccacacctgg ttgctgacta attgagatgc at - #gctttgca       1440                                                                          - tacttctgcc tgctggggag cctggggact ttccacaccc taactgacac ac - #attccaca       1500                                                                          - ggacattgat tattgactag ttagtccgcg aaatcgagat gctttgaaga tt - #aaaattaa       1560                                                                      
    - atttaatttt atgcgagact ggtttcctta ttttttgtat agtcgcatgc aa - #gcgaggtt       1620                                                                          - cgcataattt ggaaaataaa ggtagtcaag aagacgttga attaaggctg ca - #gtttcaaa       1680                                                                          - gtactctaca aacgattcct tttaaaaaaa aagattcaaa aaaaaggcaa ag - #ggtttaag       1740                                                                          - taatgcttgt tatttcaatt tacctccaaa cagttactaa tgcaattgca aa - #aaaaaaac       1800                                                                          - ctacctattg aatcaaaatt tctagcccat ccatcgctcc tcaagataaa gg - #aatcgata       1860                                                                          - ttttgagttt aagggagttg ctgatagatt tcagaattaa aaatttttgg aa - #aaggatgt       1920                                                                          - cgagaacaag aagatacgtc tagattgctg atgatgcatt ctagcagacg ga - #aatacaac       1980                                                                          - gatatgtgga cagcacgact tttgatccgt tcggatcaaa aggaagagaa at - #atccatct       2040                                                                          - ttcaagaaga atgcaggaaa agcaataaat gcccatttga ttcctaaatt at - #ccccaaaa       2100                                                                          - atgaacatta tgagatcttc ttgtgggaga caggaaattt cgcaattcca aa - #cgaaaatt       2160                                                                          - cggctctttt ttttacccca cagttgcggg gtaaatgatg taacggacct tg - #ggggaaag       2220                                                                          - gatgatgagt tagttgggaa gcggaaaaaa tggaaaacgg aagtaagaat ag - #aaaccagt       2280                                                                          - atggctgagt gcaatggcgg aaaagatttt acagagatga caagaatcta tt - #tatctata       2340                                          
                                - aggaaaaact ttttccaaat ttgtctaaaa acgcattctc ctcaattgcc tc - #taggtaga       2400                                                                          - tgatataacg aattggaacg agacatcgct aaccggtttt ctttgtaaat ga - #cattttgt       2460                                                                          - agtgggagta agtttgaatg gagggataga cagatgaata gtatgagata ga - #agaatagt       2520                                                                          - atatataatg attaagatga acaaataaaa attgaaagaa aaaagaaatt gt - #tggctcat       2580                                                                          - ttggttcata cacatgttgg ttcatacaac ttttacccat cgtaagtatt at - #aagtaaaa       2640                                                                          - aatagagtac gaaaagctat aagtagtgaa gcaaaaaaat agaaaaatag aa - #aaaaaaat       2700                                                                          - atatataaaa aaatataata aaaataaaac tcataagaga cgtaaaacac aa - #gaattgtc       2760                                                                          - tatcatttgt tctttaagaa gcaccaccat tctgtaaaac tcttcatttc tc - #attagcaa       2820                                                                          - ggaccctttt cattccttcc tctttagaat ccttttcatt ataacgaatt gg - #ataatacg       2880                                                                          - caaataagaa cacatcccct aaatacgata tatcgatcca ttttttactt tg - #cctagctt       2940                                                                          - attgctgtac aattccattt aaatagtttc tcctcaagaa agatcgtcaa tg - #gaggcgac       3000                                                                          - aatataccgg aatttaagtt gcggacacag agcttgaaaa gactgcattt tg - #tattgttt       3060                                                                          - tcaagtaaat gaaactgagt tttgaagtct caaaatacat cttatgtatt ga - #acattaga       3120              
                                                            - agaacatata agatagatct tgagagctca attcatcgac attctagcca tc - #atactgcg       3180                                                                          - atcttagaca ttgtcagcac aaccttagat cgaaaatgaa cacgttacca aa - #cgttgtct       3240                                                                          - aaaacttgcc gaatcttatc tccgcattac ttccgtaatc cttagtacat ac - #gctgcaat       3300                                                                          - ttcggaaggt catgatcgac tttttgtgta gctataagtg acgcaaatga ga - #aacatgac       3360                                                                          - aaggtgcgat atttagcaag atattatgca tttgatggag aaaggaaatt tc - #ggatgtat       3420                                                                          - atatagtacc gttagctgcg ctttttttgg tcatccataa ttttcaaact ca - #ctgctttc       3480                                                                          - gatcagattt accgttttta aggtctttat tgctttgtga tctgtaggtt gg - #aacatcta       3540                                                                          - tagttcattt tctaaaagat cctttcatcg tttcatcgga tagtaatcgt tc - #aagaaaaa       3600                                                                          - aaaagaaaaa aagaaaaaga aaaagaaaag aaaaataaac cgctataatt ca - #ttacctat       3660                                                                          - ttgactgaag gttcttcatc ttgaattgtt ttgaatcaaa ataaagaaat ta - #ttattatt       3720                                                                          - attttttttc ttcgcttttt ctttatccat tcgtcgaaac tatttttctg ct - #gataaaag       3780                                                                          - caatcattcc tttttcctgc ttctcttgtt attcgaattt taaacgactt tt - #tttcctcg       3840                                                                          - tccattccct aattctttgc gaccttttct gattctatcc ttggtttgta ct - 
#ttcgttgt       3900                                                                          - gtaattgttg agaaagtgaa ctgattattt aattgttgtg aaaaaaattc ta - #aaactatt       3960                                                                          - ttgtttttct tgatcattca tcctttgctc gcttgcttga atattacaga aa - #ttcgtctc       4020                                                                          - cctttcaacg gaatatgata atttgttgaa tactctaaat caattaacac ct - #atcaaaag       4080                                                                          - ctgaaacatt aaatctattc tcaccaaaaa aaaagactca agcttcttcg tt - #gttggccg       4140                                                                          - gtctcttttt tgttttacga ttgttaaatt ttatactcac aactgccaat tc - #tccacttt       4200                                                                          - tgactattta ttgatagtcc ctatttaatt ttctgttcac cgattatcgt ct - #tttttgta       4260                                                                          - aataatcttt cttggaacca accaattaat acgttataat cgctaacttt ga - #agatttgc       4320                                                                          - tacaatggca atggtatcag aattcgagct cggtacccgg ggatcctcta ga - #gtcgacct       4380                                                                          - gcaggcatgc aagcttaaat aggaaagttt cttcaacagg attacagtgt ag - #ctacctac       4440                                                                          - atgctgaaaa atatagcctt taaatcattt ttatattata actctgtata at - #agagataa       4500                                                                          - gtccattttt taaaaatgtt ttccccaaac cataaaaccc tatacaagtt gt - #tctagtaa       4560                                                                          - caatacatga gaaagatgtc tatgtagctg aaaataaaat gacgtcacaa ga - #caaaaaaa       4620                                                                          - aaaaaaaaaa aaaaaaaaaa 
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aa - #aaaaagta       4680                                                                          - ccttctgagg cggaaagaac cagccggatc cagacatgat aagatacatt ga - #tgagtttg       4740                                                                          - gacaaaccac aactagaatg cagtgaaaaa aatgctttat ttgtgaaatt tg - #tgatgcta       4800                                                                          - ttgctttatt tgtaaccatt ataagctgca ataaacaagt taacaacaac aa - #ttgcattc       4860                                                                          - attttatgtt tcaggttcag ggggaggtgt gggaggtttt ttaaagcaag ta - #aaacctct       4920                                                                          - acaaatgtgg tatggctgat tatgatccgg ctgcctcgcg cgtttcggtg at - #gacggtga       4980                                                                          - aaacctctga cacatgcagc tcccggagac ggtcacagct tgtctgtaag cg - #gatgccgg       5040                                                                          - gagcagacaa gcccgtcagg gcgcgtcagc gggtgttggc gggtgtcggg gc - #gcagccat       5100                                                                          - gacccagtca cgtagcgata gcggagtgta tactggctta actatgcggc at - #cagagcag       5160                                                                          - attgtactga gagtgcacca tatgcggtgt gaaataccgc acagatgcgt aa - #ggagaaaa       5220                                                                          - taccgcatca ggcgctcttc cgcttcctcg ctcactgact cgctgcgctc gg - #tcgttcgg       5280                                                                          - ctgcggcgag cggtatcagc tcactcaaag gcggtaatac ggttatccac ag - #aatcaggg       5340                                                                          - gataacgcag gaaagaatga gcaaaaggcc agcaaaaggc caggaaccgt aa - #aaaggccg       5400                                                                      
    - cgttgctggc gtttttccat aggctccgcc cccctgacga gcatcacaaa aa - #tcgacgct       5460                                                                          - caagtcagag gtggcgaaac ccgacaggac tataaagata ccaggcgttt cc - #ccctggaa       5520                                                                          - gctccctcgt gcgctctcct gttccgaccc tgccgcttac cggatacctg tc - #cgcctttc       5580                                                                          - tcccttcggg aagcgtggcg ctttctcata gctcacgctg taggtatctc ag - #ttcggtgt       5640                                                                          - aggtcgttcg ctccaagctg ggctgtgtgc acgaaccccc cgttcagccc ga - #ccgctgcg       5700                                                                          - ccttatccgg taactatcgt cttgagtcca acccggtaag acacgactta tc - #gccactgg       5760                                                                          - cagcagccac tggtaacagg attagcagag cgaggtatgt aggcggtgct ac - #agagttct       5820                                                                          - tgaagtggtg gcctaactac ggctacacta gaaggacagt atttggtatc tg - #cgctctgc       5880                                                                          - tgaagccagt taccttcgga aaaagagttg gtagctcttg atccggcaaa ca - #aaccaccg       5940                                                                          - ctggtagcgg tggttttttt gtttgcaagc agcagattac gcgcagaaaa aa - #aggatctc       6000                                                                          - aagaagatcc tttgatcttt tctacggggt ctgacgctca gtggaacgaa aa - #ctcacgtt       6060                                                                          - aagggatttt ggtcatgaga ttatcaaaaa ggatcttcac ctagatcctt tt - #aaattaaa       6120                                                                          - aatgaagttt taaatcaatc taaagtatat atgagtaaac ttggtctgac ag - #ttaccaat       6180                                          
                                - gcttaatcag tgaggcacct atctcagcga tctgtctatt tcgttcatcc at - #agttgcct       6240                                                                          - gactccccgt cgtgtagata actacgatac gggagggctt accatctggc cc - #cagtgctg       6300                                                                          - caatgatacc gcgagaccca cgctcaccgg ctccagattt atcagcaata aa - #ccagccag       6360                                                                          - ccggaagggc cgagcgcaga agtggtcctg caactttatc cgcctccatc ca - #gtctatta       6420                                                                          - attgttgccg ggaagctaga gtaagtagtt cgccagttaa tagtttgcgc aa - #cgttgttg       6480                                                                          - ccattgctgc aggcatcgtg gtgtcacgct cgtcgtttgg tatggcttca tt - #cagctccg       6540                                                                          - gttcccaacg atcaaggcga gttacatgat cccccatgtt gtgcaaaaaa gc - #ggttagct       6600                                                                          - ccttcggtcc tccgatcgtt gtcagaagta agttggccgc agtgttatca ct - #catggtta       6660                                                                          - tggcagcact gcataattct cttactgtca tgccatccgt aagatgcttt tc - #tgtgactg       6720                                                                          - gtgagtactc aaccaagtca ttctgagaat agtgtatgcg gcgaccgagt tg - #ctcttgcc       6780                                                                          - cggcgtcaac acgggataat accgcgccac atagcagaac tttaaaagtg ct - #catcattg       6840                                                                          - gaaaacgttc ttcggggcga aaactctcaa ggatcttacc gctgttgaga tc - #cagttcga       6900                                                                          - tgtaacccac tcgtgcaccc aactgatctt cagcatcttt tactttcacc ag - #cgtttctg       6960              
ggtgagcaaa aacaggaagg caaaatgccg caaaaaaggg aataagggcg acacggaaat       7020
gttgaatact catactcttc ctttttcaat attattgaag catttatcag ggttattgtc       7080
tcatgagcgg atacatattt gaatgtattt agaaaaataa acaaataggg gttccgcgca       7140
catttccccg aaaagtgcca cctgacgtct aagaaaccat tattatcatg acattaacct       7200
ataaaaatag gcgtatcacg aggccctttc gtcttcaaga attggtcgac caattctcat       7260
#            7286  tcga taagct

<210> SEQ ID NO 4
<211> LENGTH: 27
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence:DNA
<400> SEQUENCE: 4
#             27   gayg gnaaygg

<210> SEQ ID NO 5
<211> LENGTH: 24
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence:DNA
<220> FEATURE:
<223> OTHER INFORMATION: N is G, C, T, or A
<400> SEQUENCE: 5
#                24ggnc aygc

<210> SEQ ID NO 6
<211> LENGTH: 27
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence:DNA
<220> FEATURE:
<223> OTHER INFORMATION: N is G, C, T, or A
<400> SEQUENCE: 6
#             27   ggrt cncgraa

<210> SEQ ID NO 7
<211> LENGTH: 24
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence:DNA
<400> SEQUENCE: 7
#                24atga acca

<210> SEQ ID NO 8
<211> LENGTH: 23
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence:DNA
<400> SEQUENCE: 8
#                 23ccc aca

<210> SEQ ID NO 9
<211> LENGTH: 29
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence:DNA
<400> SEQUENCE: 9
#            29    cgaa atcgagatg

<210> SEQ ID NO 10
<211> LENGTH: 42
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence:DNA
<400> SEQUENCE: 10
#  42              attg ccattgtagc aaatcttcaa ag

<210> SEQ ID NO 11
<211> LENGTH: 30
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence:DNA
<400> SEQUENCE: 11
#           30     gcaa atcttcaaag

<210> SEQ ID NO 12
<211> LENGTH: 27
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence:DNA
<400> SEQUENCE: 12
#             27   agga gctgttc

<210> SEQ ID NO 13
<211> LENGTH: 27
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence:DNA
<400> SEQUENCE: 13
#             27   cagc tcgtcca

<210> SEQ ID NO 14
<211> LENGTH: 7938
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence:DNA
<400> SEQUENCE: 14
                            - agcttgaaaa aacctcccac acctccccct gaacctgaaa cataaaatga at - #gcaattgt         60                                                                          - tgttgttaac ttgtttattg cagcttataa tggttacaaa taaagcaata gc - #atcacaaa        120                                                                          - tttcacaaat aaagcatttt tttcactgca ttctagttgt ggtttgtcca aa - #ctcatcaa        180                                                                          - tgtatcttat catgtctgga tcgatcccgg caggttgggc gtcgcttggt cg - #gtcatttc        240                                                                          - gaaccccaga gtcccgctca gaagaactcg tcaagaaggc gatagaaggc ga - #tgcgctgc        300                                                                          - gaatcgggag cggcgatacc gtaaagcacg aggaagcggt cagcccattc gc - #cgccaagc        360                                                                          - tcttcagcaa tatcacgggt agccaacgct atgtcctgat agcggtccgc ca - #cacccagc        420                                                                          - cggccacagt cgatgaatcc agaaaagcgg ccattttcca ccatgatatt cg - #gcaagcag        480                                                                          - gcatcgccat gggtcacgac gagatcctcg ccgtcgggca tgcgcgcctt ga - #gcctggcg        540                                                                          - aacagttcgg ctggcgcgag cccctgatgc tcttcgtcca gatcatcctg at - #cgacaaga        600                                                                          - ccggcttcca tccgagtacg tgctcgctcg atgcgatgtt tcgcttggtg gt - #cgaatggg        660                                                                          - caggtagccg gatcaagcgt atgcagccgc cgcattgcat cagccatgat gg - #atactttc        720                                                                          - tcggcaggag caaggtgaga tgacaggaga tcctgccccg gcacttcgcc ca - #atagcagc        780                  
                                                        - cagtcccttc ccgcttcagt gacaacgtcg agcacagctg cgcaaggaac gc - #ccgtcgtg        840                                                                          - gccagccacg atagccgcgc tgcctcgtcc tgcagttcat tcagggcacc gg - #acaggtcg        900                                                                          - gtcttgacaa aaagaaccgg gcgcccctgc gctgacagcc ggaacacggc gg - #catcagag        960                                                                          - cagccgattg tctgttgtgc ccagtcatag ccgaatagcc tctccaccca ag - #cggccgga       1020                                                                          - gaacctgcgt gcaatccatc ttgttcaatc atgcgaaacg atcctcatcc tg - #tctcttga       1080                                                                          - tcagatccgg gacctgaaat aaaagacaaa aagactaaac ttaccagtta ac - #tttctggt       1140                                                                          - ttttcagttc ctcgaggagc tttttgcaaa agcctaggcc tccaaaaaag cc - #tcctcact       1200                                                                          - acttctggaa tagctcagag gccgaggcgg cctcggcctc tgcataaata aa - #aaaaatta       1260                                                                          - gtcagccatg gggcggagaa tgggcggaac tgggcggagt taggggcggg at - #gggcggag       1320                                                                          - ttaggggcgg gactatggtt gctgactaat tgagatgcat gctttgcata ct - #tctgcctg       1380                                                                          - ctggggagcc tggggacttt ccacacctgg ttgctgacta attgagatgc at - #gctttgca       1440                                                                          - tacttctgcc tgctggggag cctggggact ttccacaccc taactgacac ac - #attccaca       1500                                                                          - ggacattgat tattgactag ttagtccgcg aaatcgagat gctttgaaga tt - #aaaattaa 
      1560                                                                          - atttaatttt atgcgagact ggtttcctta ttttttgtat agtcgcatgc aa - #gcgaggtt       1620                                                                          - cgcataattt ggaaaataaa ggtagtcaag aagacgttga attaaggctg ca - #gtttcaaa       1680                                                                          - gtactctaca aacgattcct tttaaaaaaa aagattcaaa aaaaaggcaa ag - #ggtttaag       1740                                                                          - taatgcttgt tatttcaatt tacctccaaa cagttactaa tgcaattgca aa - #aaaaaaac       1800                                                                          - ctacctattg aatcaaaatt tctagcccat ccatcgctcc tcaagataaa gg - #aatcgata       1860                                                                          - ttttgagttt aagggagttg ctgatagatt tcagaattaa aaatttttgg aa - #aaggatgt       1920                                                                          - cgagaacaag aagatacgtc tagattgctg atgatgcatt ctagcagacg ga - #aatacaac       1980                                                                          - gatatgtgga cagcacgact tttgatccgt tcggatcaaa aggaagagaa at - #atccatct       2040                                                                          - ttcaagaaga atgcaggaaa agcaataaat gcccatttga ttcctaaatt at - #ccccaaaa       2100                                                                          - atgaacatta tgagatcttc ttgtgggaga caggaaattt cgcaattcca aa - #cgaaaatt       2160                                                                          - cggctctttt ttttacccca cagttgcggg gtaaatgatg taacggacct tg - #ggggaaag       2220                                                                          - gatgatgagt tagttgggaa gcggaaaaaa tggaaaacgg aagtaagaat ag - #aaaccagt       2280                                                                          - atggctgagt gcaatggcgg aaaagatttt 
acagagatga caagaatcta tt - #tatctata       2340                                                                          - aggaaaaact ttttccaaat ttgtctaaaa acgcattctc ctcaattgcc tc - #taggtaga       2400                                                                          - tgatataacg aattggaacg agacatcgct aaccggtttt ctttgtaaat ga - #cattttgt       2460                                                                          - agtgggagta agtttgaatg gagggataga cagatgaata gtatgagata ga - #agaatagt       2520                                                                          - atatataatg attaagatga acaaataaaa attgaaagaa aaaagaaatt gt - #tggctcat       2580                                                                          - ttggttcata cacatgttgg ttcatacaac ttttacccat cgtaagtatt at - #aagtaaaa       2640                                                                          - aatagagtac gaaaagctat aagtagtgaa gcaaaaaaat agaaaaatag aa - #aaaaaaat       2700                                                                          - atatataaaa aaatataata aaaataaaac tcataagaga cgtaaaacac aa - #gaattgtc       2760                                                                          - tatcatttgt tctttaagaa gcaccaccat tctgtaaaac tcttcatttc tc - #attagcaa       2820                                                                          - ggaccctttt cattccttcc tctttagaat ccttttcatt ataacgaatt gg - #ataatacg       2880                                                                          - caaataagaa cacatcccct aaatacgata tatcgatcca ttttttactt tg - #cctagctt       2940                                                                          - attgctgtac aattccattt aaatagtttc tcctcaagaa agatcgtcaa tg - #gaggcgac       3000                                                                          - aatataccgg aatttaagtt gcggacacag agcttgaaaa gactgcattt tg - #tattgttt       3060                                                                          - 
tcaagtaaat gaaactgagt tttgaagtct caaaatacat cttatgtatt ga - #acattaga       3120                                                                          - agaacatata agatagatct tgagagctca attcatcgac attctagcca tc - #atactgcg       3180                                                                          - atcttagaca ttgtcagcac aaccttagat cgaaaatgaa cacgttacca aa - #cgttgtct       3240                                                                          - aaaacttgcc gaatcttatc tccgcattac ttccgtaatc cttagtacat ac - #gctgcaat       3300                                                                          - ttcggaaggt catgatcgac tttttgtgta gctataagtg acgcaaatga ga - #aacatgac       3360                                                                          - aaggtgcgat atttagcaag atattatgca tttgatggag aaaggaaatt tc - #ggatgtat       3420                                                                          - atatagtacc gttagctgcg ctttttttgg tcatccataa ttttcaaact ca - #ctgctttc       3480                                                                          - gatcagattt accgttttta aggtctttat tgctttgtga tctgtaggtt gg - #aacatcta       3540                                                                          - tagttcattt tctaaaagat cctttcatcg tttcatcgga tagtaatcgt tc - #aagaaaaa       3600                                                                          - aaaagaaaaa aagaaaaaga aaaagaaaag aaaaataaac cgctataatt ca - #ttacctat       3660                                                                          - ttgactgaag gttcttcatc ttgaattgtt ttgaatcaaa ataaagaaat ta - #ttattatt       3720                                                                          - attttttttc ttcgcttttt ctttatccat tcgtcgaaac tatttttctg ct - #gataaaag       3780                                                                          - caatcattcc tttttcctgc ttctcttgtt attcgaattt taaacgactt tt - #tttcctcg       3840                                                
                          - tccattccct aattctttgc gaccttttct gattctatcc ttggtttgta ct - #ttcgttgt       3900                                                                          - gtaattgttg agaaagtgaa ctgattattt aattgttgtg aaaaaaattc ta - #aaactatt       3960                                                                          - ttgtttttct tgatcattca tcctttgctc gcttgcttga atattacaga aa - #ttcgtctc       4020                                                                          - cctttcaacg gaatatgata atttgttgaa tactctaaat caattaacac ct - #atcaaaag       4080                                                                          - ctgaaacatt aaatctattc tcaccaaaaa aaaagactca agcttcttcg tt - #gttggccg       4140                                                                          - gtctcttttt tgttttacga ttgttaaatt ttatactcac aactgccaat tc - #tccacttt       4200                                                                          - tgactattta ttgatagtcc ctatttaatt ttctgttcac cgattatcgt ct - #tttttgta       4260                                                                          - aataatcttt cttggaacca accaattaat acgttataat cgctaacttt ga - #agatttgc       4320                                                                          - tacaatggct agcaagggcg aggagctgtt caccggggtg gtgcccatcc tg - #gtcgagct       4380                                                                          - ggacggcgac gtaaacggcc acaagttcag cgtgtccggc gagggcgagg gc - #gatgccac       4440                                                                          - ctacggcaag ctgaccctga agttcatctg caccaccggc aagctgcccg tg - #ccctggcc       4500                                                                          - caccctcgtg accaccctga cctacggcgt gcagtgcttc agccgctacc cc - #gaccacat       4560                                                                          - gaagcagcac gacttcttca agtccgccat gcccgaaggc tacgtccagg ag - #cgcaccat       4620                    
                                                      - cttcttcaag gacgacggca actacaagac ccgcgccgag gtgaagttcg ag - #ggcgacac       4680                                                                          - cctggtgaac cgcatcgagc tgaagggcat cgacttcaag gaggacggca ac - #atcctggg       4740                                                                          - gcacaagctg gagtacaact acaacagcca caacgtctat atcatggccg ac - #aagcagaa       4800                                                                          - gaacggcatc aaggtgaact tcaagatccg ccacaacatc gaggacggca gc - #gtgcagct       4860                                                                          - cgccgaccac taccagcaga acacccccat cggcgacggc cccgtgctgc tg - #cccgacaa       4920                                                                          - ccactacctg agcacccagt ccgccctgag caaagacccc aacgagaagc gc - #gatcacat       4980                                                                          - ggtcctgctg gagttcgtga ccgccgccgg gatcactctc ggcatggacg ag - #ctgtacaa       5040                                                                          - gtaagcttaa ataggaaagt ttcttcaaca ggattacagt gtagctacct ac - #atgctgaa       5100                                                                          - aaatatagcc tttaaatcat ttttatatta taactctgta taatagagat aa - #gtccattt       5160                                                                          - tttaaaaatg ttttccccaa accataaaac cctatacaag ttgttctagt aa - #caatacat       5220                                                                          - gagaaagatg tctatgtagc tgaaaataaa atgacgtcac aagacaaaaa aa - #aaaaaaaa       5280                                                                          - aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaag ta - #ccttctga       5340                                                                          - ggcggaaaga accagccgga tccagacatg ataagataca ttgatgagtt tg - #gacaaacc   
    5400                                                                          - acaactagaa tgcagtgaaa aaaatgcttt atttgtgaaa tttgtgatgc ta - #ttgcttta       5460                                                                          - tttgtaacca ttataagctg caataaacaa gttaacaaca acaattgcat tc - #attttatg       5520                                                                          - tttcaggttc agggggaggt gtgggaggtt ttttaaagca agtaaaacct ct - #acaaatgt       5580                                                                          - ggtatggctg attatgatcc ggctgcctcg cgcgtttcgg tgatgacggt ga - #aaacctct       5640                                                                          - gacacatgca gctcccggag acggtcacag cttgtctgta agcggatgcc gg - #gagcagac       5700                                                                          - aagcccgtca gggcgcgtca gcgggtgttg gcgggtgtcg gggcgcagcc at - #gacccagt       5760                                                                          - cacgtagcga tagcggagtg tatactggct taactatgcg gcatcagagc ag - #attgtact       5820                                                                          - gagagtgcac catatgcggt gtgaaatacc gcacagatgc gtaaggagaa aa - #taccgcat       5880                                                                          - caggcgctct tccgcttcct cgctcactga ctcgctgcgc tcggtcgttc gg - #ctgcggcg       5940                                                                          - agcggtatca gctcactcaa aggcggtaat acggttatcc acagaatcag gg - #gataacgc       6000                                                                          - aggaaagaat gagcaaaagg ccagcaaaag gccaggaacc gtaaaaaggc cg - #cgttgctg       6060                                                                          - gcgtttttcc ataggctccg cccccctgac gagcatcaca aaaatcgacg ct - #caagtcag       6120                                                                          - aggtggcgaa acccgacagg actataaaga taccaggcgt 
ttccccctgg aa - #gctccctc       6180                                                                          - gtgcgctctc ctgttccgac cctgccgctt accggatacc tgtccgcctt tc - #tcccttcg       6240                                                                          - ggaagcgtgg cgctttctca tagctcacgc tgtaggtatc tcagttcggt gt - #aggtcgtt       6300                                                                          - cgctccaagc tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg cg - #ccttatcc       6360                                                                          - ggtaactatc gtcttgagtc caacccggta agacacgact tatcgccact gg - #cagcagcc       6420                                                                          - actggtaaca ggattagcag agcgaggtat gtaggcggtg ctacagagtt ct - #tgaagtgg       6480                                                                          - tggcctaact acggctacac tagaaggaca gtatttggta tctgcgctct gc - #tgaagcca       6540                                                                          - gttaccttcg gaaaaagagt tggtagctct tgatccggca aacaaaccac cg - #ctggtagc       6600                                                                          - ggtggttttt ttgtttgcaa gcagcagatt acgcgcagaa aaaaaggatc tc - #aagaagat       6660                                                                          - cctttgatct tttctacggg gtctgacgct cagtggaacg aaaactcacg tt - #aagggatt       6720                                                                          - ttggtcatga gattatcaaa aaggatcttc acctagatcc ttttaaatta aa - #aatgaagt       6780                                                                          - tttaaatcaa tctaaagtat atatgagtaa acttggtctg acagttacca at - #gcttaatc       6840                                                                          - agtgaggcac ctatctcagc gatctgtcta tttcgttcat ccatagttgc ct - #gactcccc       6900                                                                          - gtcgtgtaga 
taactacgat acgggagggc ttaccatctg gccccagtgc tg - #caatgata       6960                                                                          - ccgcgagacc cacgctcacc ggctccagat ttatcagcaa taaaccagcc ag - #ccggaagg       7020                                                                          - gccgagcgca gaagtggtcc tgcaacttta tccgcctcca tccagtctat ta - #attgttgc       7080                                                                          - cgggaagcta gagtaagtag ttcgccagtt aatagtttgc gcaacgttgt tg - #ccattgct       7140                                                                          - gcaggcatcg tggtgtcacg ctcgtcgttt ggtatggctt cattcagctc cg - #gttcccaa       7200                                                                          - cgatcaaggc gagttacatg atcccccatg ttgtgcaaaa aagcggttag ct - #ccttcggt       7260                                                                          - cctccgatcg ttgtcagaag taagttggcc gcagtgttat cactcatggt ta - #tggcagca       7320                                                                          - ctgcataatt ctcttactgt catgccatcc gtaagatgct tttctgtgac tg - #gtgagtac       7380                                                                          - tcaaccaagt cattctgaga atagtgtatg cggcgaccga gttgctcttg cc - #cggcgtca       7440                                                                          - acacgggata ataccgcgcc acatagcaga actttaaaag tgctcatcat tg - #gaaaacgt       7500                                                                          - tcttcggggc gaaaactctc aaggatctta ccgctgttga gatccagttc ga - #tgtaaccc       7560                                                                          - actcgtgcac ccaactgatc ttcagcatct tttactttca ccagcgtttc tg - #ggtgagca       7620                                                                          - aaaacaggaa ggcaaaatgc cgcaaaaaag ggaataaggg cgacacggaa at - #gttgaata       7680                                                           
ctcatactct tcctttttca atattattga agcatttatc agggttattg tctcatgagc       7740
ggatacatat ttgaatgtat ttagaaaaat aaacaaatag gggttccgcg cacatttccc       7800
cgaaaagtgc cacctgacgt ctaagaaacc attattatca tgacattaac ctataaaaat       7860
aggcgtatca cgaggccctt tcgtcttcaa gaattggtcg accaattctc atgtttgaca       7920
#7938              ct

<210> SEQ ID NO 15
<211> LENGTH: 29
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence:DNA
<400> SEQUENCE: 15
#            29    cgaa atcgagatg

<210> SEQ ID NO 16
<211> LENGTH: 42
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence:DNA
<400> SEQUENCE: 16
#  42              attg ccattgtagc aaatcttcaa ag

<210> SEQ ID NO 17
                                           <211> LENGTH: 28                                                              <212> TYPE: DNA                                                               <213> ORGANISM: Artificial Sequence                                           <220> FEATURE:                                                                #Sequence:DNANFORMATION: Description of Artificial                            - <400> SEQUENCE: 17                                                          #             28   atat attttagc                                              - <210> SEQ ID NO 18                                                          <211> LENGTH: 27                                                              <212> TYPE: DNA                                                               <213> ORGANISM: Artificial Sequence                                           <220> FEATURE:                                                                #Sequence:DNANFORMATION: Description of Artificial                            - <400> SEQUENCE: 18                                                          #             27   ccag atagtct                                               - <210> SEQ ID NO 19                                                          <211> LENGTH: 20                                                              <212> TYPE: DNA                                                               <213> ORGANISM: Artificial Sequence                                           <220> FEATURE:                                                                #Sequence:DNANFORMATION: Description of Artificial                            - <400> SEQUENCE: 19                                                          # 20               agta                                                       - <210> SEQ ID NO 20                                                          <211> LENGTH: 29                                                              <212> 
TYPE: DNA                                                               <213> ORGANISM: Artificial Sequence                                           <220> FEATURE:                                                                #Sequence:DNANFORMATION: Description of Artificial                            - <400> SEQUENCE: 20                                                          #            29    ttta catataagt                                             - <210> SEQ ID NO 21                                                          <211> LENGTH: 30                                                              <212> TYPE: DNA                                                               <213> ORGANISM: Artificial Sequence                                           <220> FEATURE:                                                                #Sequence:DNANFORMATION: Description of Artificial                            - <400> SEQUENCE: 21                                                          #           30     cccc aacctcttca                                            - <210> SEQ ID NO 22                                                          <211> LENGTH: 20                                                              <212> TYPE: DNA                                                               <213> ORGANISM: Artificial Sequence                                           <220> FEATURE:                                                                #Sequence:DNANFORMATION: Description of Artificial                            - <400> SEQUENCE: 22                                                          # 20               tata                                                       - <210> SEQ ID NO 23                                                          <211> LENGTH: 332                                                             <212> TYPE: PRT                                                               <213> ORGANISM: Schwanniomyces occidentalis             
                      - <400> SEQUENCE: 23                                                          - Pro Leu Thr Thr Thr Phe Phe Gly Tyr Val Al - #a Ser Ser Ser Ile Asp         #                 15                                                          - Leu Ser Val Asp Thr Ser Glu Tyr Asn Arg Pr - #o Leu Ile His Phe Thr         #             30                                                              - Pro Glu Lys Gly Trp Met Asn Asp Pro Asn Gl - #y Thr Phe Tyr Asp Lys         #         45                                                                  - Thr Ala Lys Thr Trp His Leu Tyr Phe Gln Ty - #r Asn Pro Asn Ala Thr         #     60                                                                      - Ala Trp Gly Gln Pro Leu Tyr Trp Gly His Al - #a Thr Ser Asn Asp Leu         # 80                                                                          - Val His Trp Asp Glu His Glu Met Ala Ile Gl - #y Pro Glu His Asp Asn         #                 95                                                          - Glu Gly Ile Phe Ser Gly Ser Ile Val Val As - #p His Asn Asn Thr Ser         #           110                                                               - Gly Phe Phe Asn Ser Ser Ile Asp Pro Asn Gl - #n Arg Ile Val Ala Ile         #       125                                                                   - Tyr Thr Asn Asn Met Pro Asp Leu Gln Thr Gl - #n Asp Ile Ala Phe Ser         #   140                                                                       - Leu Asp Gly Gly Tyr Thr Phe Thr Lys Tyr Gl - #u Asn Asn Pro Val Ile         145                 1 - #50                 1 - #55                 1 -       #60                                                                           - Asp Val Ser Ser Asn Gln Phe Arg Asp Pro Ly - #s Val Phe Trp His Glu         #               175                                                           - Arg Phe Lys Ser Met Asp His Gly Cys Ser Gl - #u Ile Ala Arg Val Lys         #           190             
                                                  - Ile Gln Ile Phe Gly Ser Ala Asn Leu Lys As - #n Trp Val Leu Asn Ser         #       205                                                                   - Asn Phe Ser Ser Gly Tyr Tyr Gly Asn Gln Ty - #r Gly Met Ser Arg Leu         #   220                                                                       - Ile Glu Val Pro Ile Glu Asn Ser Asp Lys Se - #r Lys Trp Val Met Phe         225                 2 - #30                 2 - #35                 2 -       #40                                                                           - Leu Ala Ile Asn Pro Gly Ser Pro Leu Gly Gl - #y Ser Ile Asn Gln Tyr         #               255                                                           - Phe Val Gly Asp Phe Asp Gly Phe Gln Phe Va - #l Pro Asp Asp Ser Gln         #           270                                                               - Thr Arg Phe Val Asp Ile Gly Lys Asp Phe Ty - #r Ala Phe Gln Thr Phe         #       285                                                                   - Ser Glu Val Glu His Gly Val Leu Gly Leu Al - #a Trp Ala Ser Asn Trp         #   300                                                                       - Gln Tyr Ala Asp Gln Val Pro Thr Asn Pro Tr - #p Arg Ser Ser Thr Ser         305                 3 - #10                 3 - #15                 3 -       #20                                                                           - Leu Ala Arg Asn Tyr Thr Leu Arg Tyr Val Me - #t Gln                         #               330                                                           - <210> SEQ ID NO 24                                                          <211> LENGTH: 337                                                             <212> TYPE: PRT                                                               <213> ORGANISM: Saccharomyces cerevisiae                                      - <400> SEQUENCE: 24                                                          
- Leu Gln Ala Phe Thr Phe Thr Leu Ala Gly Ph - #e Ala Ala Lys Met Ser         #                 15                                                          - Ala Ser Met Thr Asn Glu Thr Ser Asp Arg Pr - #o Leu Val His Phe Thr         #             30                                                              - Pro Asn Lys Gly Trp Met Asn Asp Pro Asn Gl - #y Leu Trp Tyr Asp Glu         #         45                                                                  - Lys Asp Ala Lys Trp His Thr Tyr Phe Gln Ty - #r Asn Pro Asn Asp Thr         #     60                                                                      - Val Trp Gly Thr Pro Leu Phe Trp Gly His Al - #a Thr Ser Asp Asp Leu         # 80                                                                          - Thr Asn Trp Glu Asp Gln Pro Ile Ala Ile Al - #a Pro Lys Arg Asn Asp         #                 95                                                          - Ser Gly Ala Phe Ser Gly Ser Met Val Val As - #p Tyr Asn Asn Thr Ser         #           110                                                               - Gly Phe Phe Asn Asp Thr Ile Asp Pro Arg Gl - #n Arg Cys Val Ala Ile         #       125                                                                   - Trp Thr Tyr Asn Thr Pro Glu Ser Glu Glu Gl - #n Tyr Ile Ser Tyr Ser         #   140                                                                       - Thr Asp Gly Gly Tyr Thr Phe Thr Glu Tyr Gl - #n Lys Asn Pro Val Leu         145                 1 - #50                 1 - #55                 1 -       #60                                                                           - Ala Ala Asn Ser Thr Gln Phe Arg Asp Pro Ly - #s Val Phe Trp Tyr Glu         #               175                                                           - Pro Ser Gln Lys Trp Ile Met Thr Ala Ala Ly - #s Ser Gln Asp Tyr Lys         #           190                                                               - Ile Glu Ile Tyr Ser Ser Asp Asp Leu Lys Se - #r 
Trp Lys Thr Glu Ser         #       205                                                                   - Ala Phe Ala Asn Glu Gly Phe Leu Gly Tyr Gl - #n Tyr Glu Cys Pro Gly         #   220                                                                       - Leu Ile Glu Val Pro Thr Glu Gln Asp Pro Se - #r Lys Ser Tyr Trp Val         225                 2 - #30                 2 - #35                 2 -       #40                                                                           - Met Phe Ile Ser Ile Asn Pro Gly Ala Pro Al - #a Gly Gly Ser Phe Asn         #               255                                                           - Gln Tyr Phe Val Gly Ser Phe Asn Gly Thr Hi - #s Phe Glu Ala Phe Asp         #           270                                                               - Asn Gln Ser Arg Val Val Asp Phe Gly Lys As - #p Tyr Tyr Ala Leu Gln         #       285                                                                   - Thr Phe Phe Asn Thr Asp Pro Thr Tyr Gly Se - #r Ala Leu Gly Ile Ala         #   300                                                                       - Trp Ala Ser Asn Trp Glu Tyr Ser Ala Phe Va - #l Pro Thr Asn Pro Trp         305                 3 - #10                 3 - #15                 3 -       #20                                                                           - Arg Ser Ser Met Ser Leu Val Arg Lys Phe Se - #r Leu Asn Thr Glu Tyr         #               335                                                           - Gln                                                                         __________________________________________________________________________

What is claimed is:
 1. An isolated DNA having the base sequence of bases 1 to 2809 in SEQ ID NO: 1 in the Sequence Listing.
 2. An isolated DNA encoding a polypeptide consisting essentially of amino acids 1-22 of SEQ ID NO: 2.
 3. A recombinant vector containing the sequence of the DNA according to claim 1 or 2.
 4. A cloning vector containing the sequence of the DNA according to claim 1 or 2 and a multicloning site.
 5. A cloning vector having the structure shown in FIG. 9.
 6. An expression vector containing the sequence of the DNA according to claim 1 or 2 and a heterologous protein structural gene.
 7. A Schizosaccharomyces pombe transformant carrying the expression vector according to claim 6.
 8. A process for producing a protein which comprises incubating the transformant according to claim 7 and recovering an expressed heterologous protein.
 9. The DNA of claim 2, wherein the polypeptide consists of amino acids 1-22 of SEQ ID NO: 2.
 10. An expression vector containing the DNA of claim 9 and a heterologous protein structural gene.
 11. A Schizosaccharomyces pombe transformant carrying the expression vector of claim 10.
 12. A method of producing a protein, comprising incubating the transformant of claim 11 and recovering an expressed heterologous protein.